By Jaini Bhansali, under the guidance of Professor Nik Bear Brown
22nd April 2018
The impact of neural networks and deep neural networks has been immense in the recent past. With the increased utilization of deep neural networks, there is in parallel an increased need to improve their performance. This blog aims to help the user with the following:
Hyperparameters control the performance of a deep neural network irrespective of the dataset. These parameters are set before the learning of the deep neural network begins. Given the hyperparameters, the algorithm learns the parameters from the data. The various hyperparameters used include the learning rate, number of epochs, network initialization, number of hidden layers and gradient estimation, to name a few.
There are many approaches available for hyperparameter tuning which are as follows:
I have implemented Manual Search with a Single Parameter and Manual Search with Multiple Parameters in this blog.
A bias term ensures that the neuron can produce a non-zero output even when all the input features are zero. The inputs and weights are treated as vectors, and the weighted sum is computed mathematically as a dot product.
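As a quick illustration (toy numbers, not part of the blog's model), the pre-activation output of a single neuron is the dot product of the weight and input vectors plus the bias:

```python
import numpy as np

# Toy numbers: the neuron's pre-activation output is w . x + b
x = np.array([0.5, -1.0, 2.0])   # input features
w = np.array([0.4, 0.3, 0.1])    # weights
b = 0.5                          # bias: lets the neuron fire even when x is all zeros

z = np.dot(w, x) + b             # weighted sum plus bias
print(z)                         # 0.6
```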
There are various kinds of activation functions, like Sigmoid, Tanh (Hyperbolic Tangent) and ReLU (Rectified Linear Unit), to name a few. Each of the activation functions has applications in different scenarios. For example, sigmoid functions are popular because they transform the input into an output that lies between 0 and 1, making them a natural fit for representing probabilities. Activation functions are used to introduce nonlinearity into the network for better computation of the output.
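For reference, these activation functions can be sketched in a few lines of NumPy (a minimal illustration, separate from the TensorFlow code used later):

```python
import numpy as np

def sigmoid(z):
    # squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # passes positives through, clips negatives to 0
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # all values in (0, 1)
print(np.tanh(z))   # all values in (-1, 1)
print(relu(z))      # [0. 0. 2.]
```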
In simple words, the loss function tells us how far our predicted value is from the ground truth. The empirical loss is the loss calculated over the entire dataset. The loss is sometimes referred to by different names, such as the objective function, cost function or empirical risk.
Cross-entropy loss, or log loss, is mainly used with models that output a probability between 0 and 1, i.e. mainly classification problems. It measures the performance of a classification model whose output is a probability value. Cross-entropy loss increases as the predicted probability diverges from the actual label; a perfect model has a log loss of 0. As the predicted probability approaches 1, the log loss decreases.
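A minimal NumPy sketch (with made-up probabilities) shows how cross-entropy loss grows as the predicted probability diverges from the one-hot label:

```python
import numpy as np

def cross_entropy(y_true, y_pred):
    # y_true: one-hot label, y_pred: predicted class probabilities
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0.0, 1.0, 0.0])        # ground truth is the second class
confident = np.array([0.05, 0.90, 0.05])  # prediction close to the label
uncertain = np.array([0.40, 0.30, 0.30])  # prediction far from the label

print(cross_entropy(y_true, confident))   # ~0.105, small loss
print(cross_entropy(y_true, uncertain))   # ~1.204, larger loss
```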
Mean squared error is mainly used in regression models that output continuous variables. (In simple words, the further the predicted values are from the true values, the larger the mean squared error loss.) Hence, the loss depicts how well our neural network is doing, and the aim is to reduce the loss to a minimum. Additionally, the loss is a function of the network weights.
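A small illustrative example of the mean squared error, using made-up values:

```python
import numpy as np

def mse(y_true, y_pred):
    # average of the squared differences between truth and prediction
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])
print(mse(y_true, y_pred))  # (0.25 + 0.25 + 0 + 1) / 4 = 0.375
```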
Since the loss is a function of the network weights, we can find the weights that yield the minimum loss.

In the context of the above loss landscape, the learning rate measures how large a step is taken in the direction of descent. Setting the learning rate can be a challenge: it must not be so low that the optimization gets stuck in a local minimum, and it must not be so large that the model diverges and blows up.
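A toy illustration of the role of the learning rate: gradient descent on the simple one-dimensional loss L(w) = (w - 3)^2, whose minimum is at w = 3. This is only a sketch; the networks below use TensorFlow's built-in optimizers instead.

```python
# Gradient of L(w) = (w - 3)^2 is 2 * (w - 3); minimum at w = 3
def gradient(w):
    return 2.0 * (w - 3.0)

w = 0.0
learning_rate = 0.1  # scales how far each step moves along the negative gradient
for _ in range(100):
    w -= learning_rate * gradient(w)
print(round(w, 4))   # converges close to 3.0
```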
Regularization constrains our optimization problem to discourage complex models and helps the model generalize to unseen data.
Dropout is a process used to randomly drop neurons so that their activation becomes 0. This ensures that the network does not come to rely on a few neurons or assign them disproportionately large weights.
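A minimal NumPy sketch of ("inverted") dropout, assuming a hypothetical keep probability of 0.5; real frameworks provide this as a built-in layer:

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    # zero each activation with probability (1 - keep_prob) and scale the
    # survivors by 1/keep_prob so the expected activation is unchanged
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
a = np.ones(10)
print(dropout(a, keep_prob=0.5, rng=rng))  # each entry is either 0.0 or 2.0
```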
Early stopping means stopping training before we have a chance to overfit. In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method; such methods update the learner so as to make it better fit the training data with each iteration.
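A minimal sketch of the early-stopping rule on a hypothetical sequence of validation losses (the `patience` parameter and the loss values are assumptions for illustration):

```python
# Stop when the validation loss has not improved for `patience` consecutive epochs
def early_stop_epoch(val_losses, patience=2):
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # stop here; training past this point overfits
    return len(val_losses) - 1

losses = [0.9, 0.6, 0.5, 0.55, 0.6, 0.7]  # validation loss starts rising after epoch 2
print(early_stop_epoch(losses))            # 4
```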
The following neural networks and their respective datasets will be used for hyperparameter tuning.
Each section of the blog provides a basic understanding of the deep neural network being used and explanation of the Neural Network Structure.
Following are the hyperparameters that have been selected for tuning for each neural network model.



For more information about the various hyperparameters and deep neural networks, refer to the video tutorials below:
Introduction to Deep Neural Networks - MIT 6.S191 https://www.youtube.com/watch?v=a5BUunInTQU&t=1227s
Introduction to Deep Learning Schedule http://introtodeeplearning.com/#schedule
An overview of the definition of each hyperparameter used in the blog is provided at the following link: https://github.com/jainibhansali/BDIA
The perceptron is the fundamental building block of a neural network: a perceptron is a single neuron. The Multi Layer Perceptron (MLP), as the name suggests, consists of multiple layers of perceptrons. MLPs are a simple class of feedforward neural networks consisting of at least 3 layers of nodes. An MLP is trained by back propagation, and each layer of nodes except the input nodes applies a nonlinear activation function. MLPs are very good classifiers.
The Multi Layer Perceptron (MLP) model used consists of a single hidden layer of 10 neurons. Gaussian initialization was used to initialize the weights and biases of the network. It was first initialized with a learning rate of 0.01, the ReLU activation function, and Softmax Cross Entropy with Logits as the cost function. It traverses the gradient using the Adam optimizer for gradient estimation.
The Iris dataset is a multivariate dataset. It consists of 3 species of Iris flowers (Iris setosa, Iris virginica, Iris versicolor) with 50 samples each. The dataset contains the petal and sepal width and length for each species. Based on these 4 features, we will use the MLP classifier to classify the Iris species.
This dataset has been selected because the Iris dataset is a popular "Hello World" dataset for classification, making it a good dataset on which to perform hyperparameter tuning.
Download the Iris Dataset from the below link
https://archive.ics.uci.edu/ml/machine-learning-databases/iris/
Right-click on the web page and select Save.
While saving, name the file 'Iris.csv' and place it in the path of the Jupyter notebook.
A sample of the path in the Jupyter Notebook is shown below:

The TensorFlow code involves creating placeholders 'X' and 'Y' that store the inputs and outputs of the neural network. Next, the weights and biases are set for the input and hidden layers to initialize the network. Following this, we set the hyperparameters: the learning rate, number of epochs, number of hidden units, and the cost and loss function for the network. In order to train the algorithm we must initialize a TensorFlow session. Within this session we loop through each epoch to train the neural network and evaluate its performance against the Iris labels. Next, we evaluate the test set within the session.
Here, n_input=4 as there are 4 features: sepal length, sepal width, petal length and petal width.
n_output=3 as the algorithm is tasked to assign the output to one of the three species.
Let's see what the code looks like.
Follow the comments for explanation and understanding. First we preprocess the Iris data.
Let's import the necessary libraries:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
import time
The Iris dataset has 4 features: the sepal length, sepal width, petal length and petal width. The labels, which are the names of the various Iris species, are converted into one-hot encoded arrays so the algorithm can read them easily. This is done using the label-encoded values.
Let's see how the dataset looks:
#read the iris data into a dataframe
dataframe = pd.read_csv('Iris.csv')
dataframe
start_time=time.time()
# create a function to label encode each Iris species. According to the function, the algorithm will identify
# Iris setosa as [1,0,0], Iris versicolor as [0,1,0] and Iris virginica as [0,0,1]
def label_encode(label):
    val = []
    if label == "Iris-setosa":
        val = [1,0,0]
    elif label == "Iris-versicolor":
        val = [0,1,0]
    elif label == "Iris-virginica":
        val = [0,0,1]
    return val
# next we assign each array to a variable
s=np.array([1,0,0])
ve=np.array([0,1,0])
vi=np.array([0,0,1])
# this array is then assigned to each specie in the dataset
dataframe['Species'] = dataframe['Species'].map({'Iris-setosa': s, 'Iris-versicolor': ve,'Iris-virginica':vi})
# shuffle the rows to break the original ordering of the dataset
dataframe=dataframe.iloc[np.random.permutation(len(dataframe))]
#reset the index
dataframe=dataframe.reset_index(drop=True)
# dividing the data set into train and test
#train data
x_input=dataframe.ix[0:105,['SepalLengthCm','SepalWidthCm','PetalLengthCm','PetalWidthCm']]
temp=dataframe['Species']
y_input=temp[0:106]
#test data
x_test=dataframe.ix[106:149,['SepalLengthCm','SepalWidthCm','PetalLengthCm','PetalWidthCm']]
y_test=temp[106:150]
dataframe
We see that the dataset is now ready for the MLP model.
Next we will define functions to plot loss vs epoch and accuracy vs epoch. Three lists are used in the functions below: epoch_list, train_loss and train_accuracy store the number of epochs, the training loss and the training accuracy respectively.
# plot train loss vs epoch
def plot_loss():
    plt.figure(figsize=(18, 5))
    plt.subplot(1, 2, 1)
    plt.title('Train Loss vs Epoch', fontsize=15)
    plt.plot(epoch_list, train_loss, 'r-')
    plt.xlabel('Epoch')
    plt.ylabel('Train Loss')
# plot train accuracy vs epoch
def plot_accuracy():
    plt.subplot(1, 2, 2)
    plt.title('Train Accuracy vs Epoch', fontsize=15)
    plt.plot(epoch_list, train_accuracy, 'b-')
    plt.xlabel('Epoch')
    plt.ylabel('Train Accuracy')
    plt.show()
Now let's define the training model. The following steps are implemented below:
Follow the comments to understand it step by step.
#The below three lists will store the epochs, training loss and training accuracy
epoch_list=[]
train_accuracy=[]
train_loss=[]
# Define a function that takes the input and applies the weights and bias
def model(x, weights, bias):
    #weights and bias for the hidden layer
    layer_1 = tf.add(tf.matmul(x, weights["hidden"]), bias["hidden"])
    # apply non linearity to the first layer
    layer_1 = tf.nn.relu(layer_1)
    # weights and bias applied to the output layer
    output_layer = tf.matmul(layer_1, weights["output"]) + bias["output"]
    return output_layer
# Defining the Learning Rate and Number of epochs
learning_rate=0.01
training_epochs=1000
#display_steps is defined to display results every 200 steps
display_steps=200
# n_input=4 here as there are 4 features sepal length, sepal width, petal width and petal length
n_input=4
n_hidden=10
#n_output=3 as there are 3 types of outputs: Iris versicolor, Iris setosa and Iris virginica
n_output=3
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_output])
#weights and biases
# Network Initialization
weights = {
    "hidden" : tf.Variable(tf.random_normal([n_input, n_hidden]), name="weight_hidden"),
    "output" : tf.Variable(tf.random_normal([n_hidden, n_output]), name="weight_output")
}
bias = {
    "hidden" : tf.Variable(tf.random_normal([n_hidden]), name="bias_hidden"),
    "output" : tf.Variable(tf.random_normal([n_output]), name="bias_output")
}
# Call the function that applies the weights and bias to the model
pred = model(X, weights, bias)
#Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=Y))
# next, minimize the cost with the optimizer
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
#Initialize Global Variables
init = tf.global_variables_initializer()
# start the tensorflow session
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        # the training set is fed to the optimizer and the cost through the feed_dict
        _, c = sess.run([optimizer, cost], feed_dict={X: x_input, Y: [t for t in y_input.as_matrix()]})
        # in order to track progress, print at each display step
        if (epoch + 1) % display_steps == 0:
            print("Epoch: ", (epoch+1), "Cost: ", c)
            test_result = sess.run(pred, feed_dict={X: x_input})
            # calculation within the tensor
            correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
            # evaluating training accuracy
            accuracy_final = accuracy.eval({X: x_input, Y: [t for t in y_input.as_matrix()]})
            print("Accuracy:", accuracy_final)
            # append the epoch, training loss and training accuracy into lists to visualize the graphs
            epoch_list.append(epoch)
            train_loss.append(c)
            train_accuracy.append(accuracy_final)
    # evaluating testing accuracy
    test_result = sess.run(pred, feed_dict={X: x_test})
    correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
    print("Accuracy Test:", accuracy.eval({X: x_test, Y: [t for t in y_test.as_matrix()]}))
#Plot the graphs
plot_loss()
plot_accuracy()
end_time = time.time()
print("Completed in ", end_time - start_time, " seconds")
With the chosen parameters, we observe that the loss decreases as the number of epochs increases, which is the ideal scenario. We also observe that the accuracy increases with the number of epochs. The training accuracy is 97.76% and the testing accuracy is 97.72%; the small gap between the two can be attributed to the random seed of the network.
Let's start hyperparameter tuning. We will tune the activation function, the gradient estimation (optimizer), and a combination of the number of epochs and learning rate.
For all tuning techniques, it is observed that retraining the model might increase or decrease the accuracy slightly due to the random seed. The random seed has not been fixed, to avoid creating a biased model.
The initial model was initialized with 1000 epochs and a learning rate of 0.01. Now we will observe the performance of the model with a learning rate of 0.1 and 100 epochs.
Let's read and preprocess the data as we did earlier.
dataframe = pd.read_csv('Iris.csv')
start_time=time.time()
def label_encode(label):
    val = []
    if label == "Iris-setosa":
        val = [1,0,0]
    elif label == "Iris-versicolor":
        val = [0,1,0]
    elif label == "Iris-virginica":
        val = [0,0,1]
    return val
s=np.array([1,0,0])
ve=np.array([0,1,0])
vi=np.array([0,0,1])
dataframe['Species'] = dataframe['Species'].map({'Iris-setosa': s, 'Iris-versicolor': ve,'Iris-virginica':vi})
dataframe=dataframe.iloc[np.random.permutation(len(dataframe))]
dataframe=dataframe.reset_index(drop=True)
#train data
x_input=dataframe.ix[0:105,['SepalLengthCm','SepalWidthCm','PetalLengthCm','PetalWidthCm']]
temp=dataframe['Species']
y_input=temp[0:106]
#test data
x_test=dataframe.ix[106:149,['SepalLengthCm','SepalWidthCm','PetalLengthCm','PetalWidthCm']]
y_test=temp[106:150]
Edit the hyperparameters to learning_rate=0.1 and training_epochs=100, with the remaining code reused from the initial model.
epoch_list=[]
train_accuracy=[]
train_loss=[]
start_time=time.time()
#Defining the Multi Layer Perceptron
def model(x, weights, bias):
    layer_1 = tf.add(tf.matmul(x, weights["hidden"]), bias["hidden"])
    layer_1 = tf.nn.relu(layer_1)
    output_layer = tf.matmul(layer_1, weights["output"]) + bias["output"]
    return output_layer
# hyperparameter tuning
learning_rate=0.1
training_epochs=100
display_steps=10
# network parameters
n_input=4
n_hidden=10
n_output=3
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_output])
#weights and biases
weights = {
    "hidden" : tf.Variable(tf.random_normal([n_input, n_hidden]), name="weight_hidden"),
    "output" : tf.Variable(tf.random_normal([n_hidden, n_output]), name="weight_output")
}
bias = {
    "hidden" : tf.Variable(tf.random_normal([n_hidden]), name="bias_hidden"),
    "output" : tf.Variable(tf.random_normal([n_output]), name="bias_output")
}
#Define Model
pred = model(X, weights, bias)
#Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
#Initialize Global Variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        _, c = sess.run([optimizer, cost], feed_dict={X: x_input, Y: [t for t in y_input.as_matrix()]})
        if (epoch + 1) % display_steps == 0:
            print("Epoch: ", (epoch+1), "Cost: ", c)
            test_result = sess.run(pred, feed_dict={X: x_input})
            correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
            accuracy_final = accuracy.eval({X: x_input, Y: [t for t in y_input.as_matrix()]})
            print("Accuracy:", accuracy_final)
            epoch_list.append(epoch)
            train_loss.append(c)
            train_accuracy.append(accuracy_final)
    test_result = sess.run(pred, feed_dict={X: x_test})
    correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
    print("Accuracy Test:", accuracy.eval({X: x_test, Y: [t for t in y_test.as_matrix()]}))
plot_loss()
plot_accuracy()
end_time = time.time()
print("Completed in ", end_time - start_time, " seconds")
In the initial model, as mentioned above, I had a learning rate of 0.01 and 1000 epochs, so I tried to tune these two parameters. It was observed that a similar accuracy was reached with a learning rate of 0.1 and 100 epochs. This is also better, as the training accuracy is 99.06% and the testing accuracy is 97.72%.
Training Accuracy =99.06% Testing accuracy= 97.72%
There is definitely an improvement in the training set performance. Moreover, reducing the number of epochs and increasing the learning rate helped computationally. Hence, reducing the number of learning steps did not affect the accuracy in the case of the Iris dataset.
Next, change the optimizer to Adagrad: replace the AdamOptimizer with the AdagradOptimizer.
epoch_list=[]
train_accuracy=[]
train_loss=[]
start_time=time.time()
#Defining the Multi Layer Perceptron
def model(x, weights, bias):
    layer_1 = tf.add(tf.matmul(x, weights["hidden"]), bias["hidden"])
    layer_1 = tf.nn.relu(layer_1)
    output_layer = tf.matmul(layer_1, weights["output"]) + bias["output"]
    return output_layer
# hyperparameter tuning
learning_rate=0.1
training_epochs=100
display_steps=10
# network parameters
n_input=4
n_hidden=10
n_output=3
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_output])
#weights and biases
weights = {
    "hidden" : tf.Variable(tf.random_normal([n_input, n_hidden]), name="weight_hidden"),
    "output" : tf.Variable(tf.random_normal([n_hidden, n_output]), name="weight_output")
}
bias = {
    "hidden" : tf.Variable(tf.random_normal([n_hidden]), name="bias_hidden"),
    "output" : tf.Variable(tf.random_normal([n_output]), name="bias_output")
}
#Define Model
pred = model(X, weights, bias)
#Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=Y))
optimizer = tf.train.AdagradOptimizer(learning_rate).minimize(cost)
#Initialize Global Variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        _, c = sess.run([optimizer, cost], feed_dict={X: x_input, Y: [t for t in y_input.as_matrix()]})
        if (epoch + 1) % display_steps == 0:
            print("Epoch: ", (epoch+1), "Cost: ", c)
            test_result = sess.run(pred, feed_dict={X: x_input})
            correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
            accuracy_final = accuracy.eval({X: x_input, Y: [t for t in y_input.as_matrix()]})
            print("Accuracy:", accuracy_final)
            epoch_list.append(epoch)
            train_loss.append(c)
            train_accuracy.append(accuracy_final)
    test_result = sess.run(pred, feed_dict={X: x_test})
    correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
    print("Accuracy Test:", accuracy.eval({X: x_test, Y: [t for t in y_test.as_matrix()]}))
plot_loss()
plot_accuracy()
end_time = time.time()
print("Completed in ", end_time - start_time, " seconds")
It is observed that the training accuracy decreased, which could be attributed to the random seed, but the testing accuracy remains the same. Hence, the Adagrad optimizer can be used as an option.
Training Accuracy - 96.22% Test Accuracy - 97.72%
Reuse the code of the first model and replace the optimizer with the AdadeltaOptimizer.
epoch_list=[]
train_accuracy=[]
train_loss=[]
start_time=time.time()
#Defining the Multi Layer Perceptron
def model(x, weights, bias):
    layer_1 = tf.add(tf.matmul(x, weights["hidden"]), bias["hidden"])
    layer_1 = tf.nn.relu(layer_1)
    output_layer = tf.matmul(layer_1, weights["output"]) + bias["output"]
    return output_layer
# hyperparameter tuning
learning_rate=0.1
training_epochs=100
display_steps=10
# network parameters
n_input=4
n_hidden=10
n_output=3
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_output])
#weights and biases
weights = {
    "hidden" : tf.Variable(tf.random_normal([n_input, n_hidden]), name="weight_hidden"),
    "output" : tf.Variable(tf.random_normal([n_hidden, n_output]), name="weight_output")
}
bias = {
    "hidden" : tf.Variable(tf.random_normal([n_hidden]), name="bias_hidden"),
    "output" : tf.Variable(tf.random_normal([n_output]), name="bias_output")
}
#Define Model
pred = model(X, weights, bias)
#Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=Y))
optimizer = tf.train.AdadeltaOptimizer(learning_rate).minimize(cost)
#Initialize Global Variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        _, c = sess.run([optimizer, cost], feed_dict={X: x_input, Y: [t for t in y_input.as_matrix()]})
        if (epoch + 1) % display_steps == 0:
            print("Epoch: ", (epoch+1), "Cost: ", c)
            test_result = sess.run(pred, feed_dict={X: x_input})
            correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
            accuracy_final = accuracy.eval({X: x_input, Y: [t for t in y_input.as_matrix()]})
            print("Accuracy:", accuracy_final)
            epoch_list.append(epoch)
            train_loss.append(c)
            train_accuracy.append(accuracy_final)
    test_result = sess.run(pred, feed_dict={X: x_test})
    correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
    print("Accuracy Test:", accuracy.eval({X: x_test, Y: [t for t in y_test.as_matrix()]}))
plot_loss()
plot_accuracy()
end_time = time.time()
print("Completed in ", end_time - start_time, " seconds")
It is observed that this hurt the performance of the model: the network's accuracy did not improve.
Next, follow the same steps as above and replace the optimizer with the Gradient Descent optimizer. You should reach the following results, or results close to them, due to the random seed:
Testing Accuracy:
Gradient Descent = 86.36%
Adadelta = 47.72%
Adagrad = 97.72%
Hence, the best choice would be the Adam or Adagrad optimizer, as these clearly outperformed the others.
Activation Function
Now we will train the model for the following Activation Functions :
Reusing the code from the initial model, replace the activation tf.nn.relu with tf.nn.sigmoid. All other parameters must be set to the initial model parameters.
Below is a sample run with the sigmoid activation function.
epoch_list=[]
train_accuracy=[]
train_loss=[]
start_time=time.time()
#Defining the Multi Layer Perceptron
def model(x, weights, bias):
    layer_1 = tf.add(tf.matmul(x, weights["hidden"]), bias["hidden"])
    layer_1 = tf.nn.sigmoid(layer_1)
    output_layer = tf.matmul(layer_1, weights["output"]) + bias["output"]
    return output_layer
# hyperparameter tuning
learning_rate=0.1
training_epochs=100
display_steps=10
# network parameters
n_input=4
n_hidden=10
n_output=3
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_output])
#weights and biases
weights = {
    "hidden" : tf.Variable(tf.random_normal([n_input, n_hidden]), name="weight_hidden"),
    "output" : tf.Variable(tf.random_normal([n_hidden, n_output]), name="weight_output")
}
bias = {
    "hidden" : tf.Variable(tf.random_normal([n_hidden]), name="bias_hidden"),
    "output" : tf.Variable(tf.random_normal([n_output]), name="bias_output")
}
#Define Model
pred = model(X, weights, bias)
#Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
#Initialize Global Variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        _, c = sess.run([optimizer, cost], feed_dict={X: x_input, Y: [t for t in y_input.as_matrix()]})
        if (epoch + 1) % display_steps == 0:
            print("Epoch: ", (epoch+1), "Cost: ", c)
            test_result = sess.run(pred, feed_dict={X: x_input})
            correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
            accuracy_final = accuracy.eval({X: x_input, Y: [t for t in y_input.as_matrix()]})
            print("Accuracy:", accuracy_final)
            epoch_list.append(epoch)
            train_loss.append(c)
            train_accuracy.append(accuracy_final)
    test_result = sess.run(pred, feed_dict={X: x_test})
    correct_pred = tf.equal(tf.argmax(test_result, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, "float"))
    print("Accuracy Test:", accuracy.eval({X: x_test, Y: [t for t in y_test.as_matrix()]}))
plot_loss()
plot_accuracy()
end_time = time.time()
print("Completed in ", end_time - start_time, " seconds")
Observation: the sigmoid activation function provided a training accuracy of 98.11% and a testing accuracy of 100%. Though this is a good result, I would consider an average of the accuracy over several runs, given the random seed.
The above steps can be repeated for the Tanh and ReLU6 activation functions. After performing these, you should reach accuracies similar to the ones below.
It is observed that tanh provided an accuracy of 50%, ReLU6 provided 61.36% and sigmoid 97.72%.
It is observed that ReLU and sigmoid performed the best; both can be taken into consideration while tuning activation functions to improve accuracy.
It is observed that tweaking the number of epochs and learning rate was an effective combination to tune. Reducing them to 100 epochs and a learning rate of 0.1 provided stability on the training and testing sets, though we must take the random seed into account.
The Adam and Adagrad optimizers worked the best and can be used for tuning.
The sigmoid and ReLU activation functions worked the best as well.
In conclusion, prospective hyperparameters to tune to improve the performance of the MLP model are the activation function, the optimizer, and the combination of number of epochs and learning rate.
Summary

CNNs are popularly used for image and video recognition. A CNN is a deep feedforward neural network used for visual imagery. CNNs need very little preprocessing, as the algorithm learns its own filters, in contrast to traditional algorithms where filters are hand-engineered. This is a big advantage.
A CNN consists of a convolutional layer followed by a pooling layer, which is then fed through a nonlinear activation function. The convolutional layer consists of filters that learn from every pixel of the image. The result is then passed to pooling, which is a kind of downsampling: it captures the rough area of a feature rather than its exact location, reducing the number of parameters and the computation in the network.
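As a small illustration of pooling as downsampling (a toy NumPy sketch, separate from the TensorFlow model below), average pooling replaces each non-overlapping 2x2 window of a feature map with its mean:

```python
import numpy as np

def average_pool_2x2(image):
    # downsample a 2D feature map by averaging non-overlapping 2x2 windows
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

feature_map = np.array([[1.0, 2.0, 3.0, 4.0],
                        [5.0, 6.0, 7.0, 8.0],
                        [1.0, 1.0, 2.0, 2.0],
                        [1.0, 1.0, 2.0, 2.0]])
print(average_pool_2x2(feature_map))
# 2x2 output: [[3.5, 5.5], [1.0, 2.0]]
```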
This neural network consists of 2 convolutional layers. Each convolutional layer has a sigmoid activation function and an average pooling layer. The 2 convolutional layers are followed by a flatten layer and a fully connected layer. The model is a LeNet model. It uses the softmax cross-entropy cost function and Stochastic Gradient Descent as the optimizer. The model is trained for 20000 epochs.

The CIFAR-10 dataset is a collection of images commonly used to train machine learning and computer vision algorithms, and it is widely used in machine learning research. It contains 60,000 32x32 color images in 10 classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships and trucks, with 6,000 images per class. This "Hello World" dataset is popularly used for object detection and recognition.
The Cifar 10 dataset consists of images and their corresponding labels. There are 10 labels. First we will write functions to preprocess the dataset.
# randomize and shuffle the dataset
def randomize(dataset, labels):
    permutation = np.random.permutation(labels.shape[0])
    shuffled_dataset = dataset[permutation, :, :]
    shuffled_labels = labels[permutation]
    return shuffled_dataset, shuffled_labels
# one hot encode each label
def one_hot_encode(np_array):
    return (np.arange(10) == np_array[:,None]).astype(np.float32)
# reformatting the data to image_width x image_height x image_depth
def reformat_data(dataset, labels, image_width, image_height, image_depth):
    np_dataset_ = np.array([np.array(image_data).reshape(image_width, image_height, image_depth) for image_data in dataset])
    np_labels_ = one_hot_encode(np.array(labels, dtype=np.float32))
    np_dataset, np_labels = randomize(np_dataset_, np_labels_)
    return np_dataset, np_labels
# flattening the array
def flatten_tf_array(array):
    shape = array.get_shape().as_list()
    return tf.reshape(array, [shape[0], shape[1] * shape[2] * shape[3]])
# calculating accuracy
def accuracy(predictions, labels):
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1)) / predictions.shape[0])
Since the data files are in the Jupyter notebook path, they will be read automatically. Here, we read the images from the train and test sets and reformat them based on the selected dimensions. The dimensions are as follows:
c10_image_height = 32
c10_image_width = 32
c10_image_depth = 3
c10_num_labels = 10
Follow the comments to understand the code below
# Code attribution by https://github.com/taspinar/sidl
import pickle
import numpy as np
# This ensures that the code always reads the jupyter notebook path
cifar10_folder = './'
train_datasets = ['data_batch_1', 'data_batch_2', 'data_batch_3', 'data_batch_4', 'data_batch_5', ]
test_dataset = ['test_batch']
# images dimensions of CIFAR 10
c10_image_height = 32
c10_image_width = 32
c10_image_depth = 3
# there are 10 classes , hence 10 labels
c10_num_labels = 10
#Reading each image from train and test batches
with open(cifar10_folder + test_dataset[0], 'rb') as f0:
    c10_test_dict = pickle.load(f0, encoding='bytes')

c10_test_dataset, c10_test_labels = c10_test_dict[b'data'], c10_test_dict[b'labels']
test_dataset_cifar10, test_labels_cifar10 = reformat_data(c10_test_dataset, c10_test_labels, c10_image_width, c10_image_height, c10_image_depth)
# images saved in train_dataset and labels saved in train_labels
c10_train_dataset, c10_train_labels = [], []
for train_dataset in train_datasets:
    with open(cifar10_folder + train_dataset, 'rb') as f0:
        c10_train_dict = pickle.load(f0, encoding='bytes')
        c10_train_dataset_, c10_train_labels_ = c10_train_dict[b'data'], c10_train_dict[b'labels']
        c10_train_dataset.append(c10_train_dataset_)
        c10_train_labels += c10_train_labels_
# reformatting the training data
c10_train_dataset = np.concatenate(c10_train_dataset, axis=0)
train_dataset_cifar10, train_labels_cifar10 = reformat_data(c10_train_dataset, c10_train_labels, c10_image_width, c10_image_height, c10_image_depth)
del c10_train_dataset
del c10_train_labels
print("The training set contains the following labels: {}".format(np.unique(c10_train_dict[b'labels'])))
print('Training set shape', train_dataset_cifar10.shape, train_labels_cifar10.shape)
print('Test set shape', test_dataset_cifar10.shape, test_labels_cifar10.shape)
The code in this section, namely Exploratory Data Analysis by Magnus Erik Hvass Pedersen, is licensed under the MIT License.
Let's look at each function used to visualize the CIFAR-10 data. Follow the comments to understand each function.
# Functions
#The code in the in this section namely Exploratory Data Analysis by Magnus Erik Hvass Pedersen is licensed under the MIT License
import math
import os
%matplotlib inline
import matplotlib.pyplot as plt
# Path where the CIFAR 10 files are present
data_path = "./"
# Various constants for the size of the images.
# Use these constants in your own program.
# Width and height of each image.
img_size = 32
# Number of channels in each image, 3 channels: Red, Green, Blue.
num_channels = 3
# Length of an image when flattened to a 1-dim array.
img_size_flat = img_size * img_size * num_channels
# Number of classes.
num_classes = 10
# Various constants used to allocate arrays of the correct size.
# Number of files for the training-set.
_num_files_train = 5
# Number of images for each batch-file in the training-set.
_images_per_file = 10000
# Total number of images in the training-set.
# This is used to pre-allocate arrays for efficiency.
_num_images_train = _num_files_train * _images_per_file
# This function is used to unpickle the test and train files and load the data. It reads each byte of the image
def _unpickle(filename):
    """
    Unpickle the given file and return the data.
    Note that the appropriate dir-name is prepended to the filename.
    """
    # Create full path for the file.
    file_path = _get_file_path(filename)
    print("Loading data: " + file_path)
    with open(file_path, mode='rb') as file:
        # In Python 3.x it is important to set the encoding,
        # otherwise an exception is raised here.
        data = pickle.load(file, encoding='bytes')
    return data

# This function takes the raw image data, reshapes it into a 4-dim array and returns the images
def _convert_images(raw):
    """
    Convert images from the CIFAR-10 format and
    return a 4-dim array with shape: [image_number, height, width, channel]
    where the pixels are floats between 0.0 and 1.0.
    """
    # Convert the raw images from the data-files to floating-points.
    raw_float = np.array(raw, dtype=float) / 255.0
    # Reshape the array to 4-dimensions.
    images = raw_float.reshape([-1, num_channels, img_size, img_size])
    # Reorder the indices of the array.
    images = images.transpose([0, 2, 3, 1])
    return images

# This function is used to load the data, then unpickle and reshape the images
def _load_data(filename):
    """
    Load a pickled data-file from the CIFAR-10 data-set
    and return the converted images (see above) and the class-number
    for each image.
    """
    # Load the pickled data-file.
    data = _unpickle(filename)
    # Get the raw images.
    raw_images = data[b'data']
    # Get the class-numbers for each image. Convert to numpy-array.
    cls = np.array(data[b'labels'])
    # Convert the images.
    images = _convert_images(raw_images)
    return images, cls

# This function is used to convert class numbers into one-hot encoded values
def one_hot_encoded_labels(class_numbers, num_classes=None):
    if num_classes is None:
        num_classes = np.max(class_numbers) + 1
    return np.eye(num_classes, dtype=float)[class_numbers]

# We will be using the test batch to visualize the images, hence we read the test batch.
# This function then loads the labels as one-hot encoded values
def load_test_data():
    """
    Load all the test-data for the CIFAR-10 data-set.
    Returns the images, class-numbers and one-hot encoded class-labels.
    """
    images, cls = _load_data(filename="test_batch")
    return images, cls, one_hot_encoded_labels(class_numbers=cls, num_classes=num_classes)

# Get the full path of the file currently present
def _get_file_path(filename=""):
    """
    Return the full path of a data-file for the data-set.
    If filename=="" then return the directory of the files.
    """
    return os.path.join(data_path, filename)

# Load class names. This function maps each class number to its class name
def load_class_names():
    """
    Load the names for the classes in the CIFAR-10 data-set.
    Returns a list with the names. Example: names[3] is the name
    associated with class-number 3.
    """
    # Load the class-names from the pickled file.
    raw = _unpickle(filename="batches.meta")[b'label_names']
    # Convert from binary strings.
    names = [x.decode('utf-8') for x in raw]
    return names
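Before moving on, one_hot_encoded_labels can be sanity-checked on a toy list of class numbers (a standalone sketch that repeats the function so it runs on its own; the class numbers are made up):

```python
import numpy as np

def one_hot_encoded_labels(class_numbers, num_classes=None):
    if num_classes is None:
        num_classes = np.max(class_numbers) + 1
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype=float)[class_numbers]

encoded = one_hot_encoded_labels(np.array([0, 2, 1]), num_classes=3)
print(encoded)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```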
# Let's load the test batch using the load_test_data function, which retrieves the images, class numbers and one-hot labels
images_test, cls_test, labels_test = load_test_data()
# This function is used to plot a 3x3 grid of images with their labels
def plot_images(images, cls_true, cls_pred=None, smooth=True):
    assert len(images) == len(cls_true) == 9
    # Create figure with sub-plots.
    fig, axes = plt.subplots(3, 3)
    # Adjust vertical spacing if we need to print ensemble and best-net.
    if cls_pred is None:
        hspace = 0.3
    else:
        hspace = 0.6
    fig.subplots_adjust(hspace=hspace, wspace=0.3)
    for i, ax in enumerate(axes.flat):
        # Interpolation type.
        if smooth:
            interpolation = 'spline16'
        else:
            interpolation = 'nearest'
        # Plot image.
        ax.imshow(images[i, :, :, :], interpolation=interpolation)
        # Name of the true class.
        cls_true_name = class_names[cls_true[i]]
        # Show true and predicted classes.
        if cls_pred is None:
            xlabel = "True: {0}".format(cls_true_name)
        else:
            # Name of the predicted class.
            cls_pred_name = class_names[cls_pred[i]]
            xlabel = "True: {0}\nPred: {1}".format(cls_true_name, cls_pred_name)
        # Show the classes as the label on the x-axis.
        ax.set_xlabel(xlabel)
        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])
    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()
# Let's have a look at the class names
class_names=load_class_names()
class_names
Finally, let's visualize the first few images in the test set.
# Get the first images from the test-set.
images = images_test[0:9]
# Get the true classes for those images.
cls_true = cls_test[0:9]
# Plot the images and labels using our helper-function above.
plot_images(images=images, cls_true=cls_true, smooth=False)
The above code can also be used to visualize the images the algorithm mislabels alongside their true labels; that is not shown here.
CIFAR-10 contains about 60,000 images split across 10 classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'. With this many images the dataset takes extremely long to train. From the various references consulted, it takes approximately 150,000 epochs to get CIFAR-10 to an accuracy of 80%. Since I did not have the computing power, I trained the CNN for 20,000 epochs, which took approximately 8 hours. To test the various hyperparameters I used a benchmark of 7,000 epochs, as the CNN model with a sigmoid activation took 7,000 epochs to reach an accuracy of 42%.
This CNN model has two convolutional layers with sigmoid activations, each followed by an average-pooling layer. These are followed by a flattening layer and then a fully connected layer, which also uses a sigmoid activation. Classification is done with softmax, the optimizer is the gradient descent optimizer, and the loss is calculated as softmax_cross_entropy_with_logits. The learning rate used is 0.5. After training the model for 20,000 epochs the accuracy is 56% on training and 48% on testing. Training takes almost 10 hours without any support from a GPU.
Now let's code the model. We first define the image dimensions; these are required by the filters as they scan through the images. Next we create a function that builds the weights and biases for each layer, and the function model_lenet5 wires those weights and biases into the layers. This architecture is known as the LeNet-5 model.
We will use the values obtained at the 7,000th epoch as the benchmark for comparing the effect of the various hyperparameters tuned.
import tensorflow as tf
# assign the dimensions
LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84
### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    # Network initialization
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activations
def model_lenet5(data, variables):
    # First convolution layer, followed by the activation function and the pooling layer
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    # Second convolution layer, followed by the activation function and the pooling layer. It takes the first layer as input
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    # Flat layer followed by the fully connected layer. It takes the second convolution layer as input
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.sigmoid(layer3_fccd)
    # Last layer is the fully connected layer that provides the output
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.sigmoid(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
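The shape of w3 relies on the hard-coded factor (image_width // 5) * (image_height // 5). A quick sketch tracing the spatial size of a 32x32 CIFAR-10 image through the two conv/pool stages shows why this coincides with the flattened feature count:

```python
# Trace the spatial size of a 32x32 CIFAR-10 image through model_lenet5
filter_size, filter_depth2 = 5, 16
size = 32                       # input height/width
size = size                     # conv1: padding='SAME', stride 1 -> still 32
size = size // 2                # avg pool 2x2, stride 2 -> 16
size = size - filter_size + 1   # conv2: padding='VALID', 5x5 kernel -> 12
size = size // 2                # avg pool 2x2, stride 2 -> 6
flat = size * size * filter_depth2
print(flat)                                   # 576 features feed the first fully connected layer
print((32 // 5) * (32 // 5) * filter_depth2)  # 576 as well, which is why w3's shape happens to match
```

Note this coincidence holds for 32x32 inputs but would break for other image sizes, so the factor is brittle rather than general.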
Next, we assign the various hyperparameters: the number of epochs (defined as num_steps), display_step, learning_rate, batch size, the loss and the optimizer.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 20001
display_step = 200
learning_rate = 0.5
#batch size
batch_size=64
graph = tf.Graph()
with graph.as_default():
    #1) First we put the input data in a tensorflow friendly form.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
    tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
    tf_test_dataset = tf.constant(test_dataset, tf.float32)
    #2) Then, the weight matrices and bias vectors are initialized
    variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
    #3) The model used to calculate the logits (predicted labels)
    model = model_lenet5
    logits = model(tf_train_dataset, variables)
    #4) Then we compute the softmax cross entropy between the logits and the (actual) labels
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    #5) The optimizer is used to calculate the gradients of the loss function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    # Predictions for the training and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
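The softmax cross-entropy loss used above can be reproduced with plain NumPy. A minimal sketch (the logits and one-hot labels here are toy values, not from the actual model) of what softmax_cross_entropy_with_logits computes per example:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def softmax_cross_entropy(logits, onehot):
    # Per-example cross-entropy: -sum(label * log(softmax(logit)))
    p = softmax(logits)
    return -np.sum(onehot * np.log(p), axis=1)

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0]])
onehot = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])
losses = softmax_cross_entropy(logits, onehot)
mean_loss = losses.mean()  # mirrors the tf.reduce_mean over the batch
print(losses, mean_loss)
```

A confident correct prediction gives a loss near 0, and the loss grows without bound as the probability assigned to the true class shrinks, which is the behavior described in the introduction.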
Let's run the TensorFlow session. Since we have selected a batch size, an offset calculation is needed to feed each batch to the network.
### running the tensorflow session
with tf.Session(graph=graph) as session:
    ### initialize all variables before the tensorflow graph is run
    tf.global_variables_initializer().run()
    print('Initialized with learning_rate', learning_rate)
    for step in range(num_steps):
        # Since we are using stochastic gradient descent, we select small batches from the
        # training dataset and train the convolutional neural network one batch at a time.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        # display the train and test predictions
        if step % display_step == 0:
            train_accuracy = accuracy(predictions, batch_labels)
            # evaluate the test predictions
            test_accuracy = accuracy(test_prediction.eval(), test_labels)
            message = "step {:04d} : loss is {:06.2f}, accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
            print(message)
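The offset arithmetic in the loop above cycles through the training set using the modulus. A small sketch (with made-up toy sizes: 10 samples, batch size 3) of the batch start indices the same formula produces:

```python
n_samples, batch_size = 10, 3  # toy values for illustration only

# Same formula as in the training loop above
offsets = [(step * batch_size) % (n_samples - batch_size) for step in range(5)]
print(offsets)  # [0, 3, 6, 2, 5] -- the offset wraps around before the end of the data
```

The wrap-around means every epoch revisits the data from a slightly different starting point, which is acceptable for stochastic gradient descent, though it never uses the final batch_size samples as a batch start.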
Next, let's delve into hyperparameter tuning to see whether the CNN's performance can be improved. We will tune the activation functions, combinations of loss and activation functions, the number of epochs, the gradient estimation, the network architecture and the network initialization.
First, we observe how the accuracy changes with the activation function. Since training for 20,000 epochs takes very long, the change in accuracy will be observed over 7,001 epochs. I chose 7,001 epochs because with a sigmoid activation the accuracy reached about 40% at that point, so I will use that as the benchmark and train the model with the activation functions below.
The activation functions considered are tanh and ReLU.
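As background for the comparison, here is a small NumPy sketch (the inputs are chosen arbitrarily for illustration) of the three activations and their gradients; sigmoid and tanh saturate for large-magnitude inputs, while ReLU keeps a gradient of 1 for any positive input:

```python
import numpy as np

x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])

sigmoid = 1.0 / (1.0 + np.exp(-x))
tanh = np.tanh(x)
relu = np.maximum(0.0, x)

# Gradients: sigmoid and tanh gradients shrink toward 0 at x = +-4 (saturation),
# while the ReLU gradient is exactly 1 for all positive inputs
d_sigmoid = sigmoid * (1.0 - sigmoid)
d_tanh = 1.0 - tanh ** 2
d_relu = (x > 0).astype(float)

print(d_sigmoid)  # tiny at the extremes: the vanishing-gradient effect
print(d_tanh)
print(d_relu)
```

This is one lens for interpreting the results that follow: the choice of activation changes how gradients flow through the five layers during training.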
Let's start by training the model with the tanh activation. Reuse the code of the initial model and replace each tf.sigmoid with tf.nn.tanh in the function named model_lenet5. Follow the comments in the code to see the change in activation function.
import tensorflow as tf
LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84
### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activation
### Different hyperparameters to tune
# Change the activation here to tanh: replace each tf.sigmoid with tf.nn.tanh
def model_lenet5(data, variables):
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.nn.tanh(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.nn.tanh(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.tanh(layer3_fccd)
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.tanh(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
Reuse the initial model code that was used to set the various hyperparameters, but set num_steps to 7001. Follow the comments to see the change.
from collections import defaultdict
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
#change num_steps to 7001
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
    #1) First we put the input data in a tensorflow friendly form.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
    tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
    tf_test_dataset = tf.constant(test_dataset, tf.float32)
    #2) Then, the weight matrices and bias vectors are initialized
    variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
    #3) The model used to calculate the logits (predicted labels)
    model = model_lenet5
    logits = model(tf_train_dataset, variables)
    #4) Then we compute the softmax cross entropy between the logits and the (actual) labels
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    #5) The optimizer is used to calculate the gradients of the loss function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    # Predictions for the training and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Reuse the code for initializing and running the TensorFlow session. Initialize lists to store the training accuracy, testing accuracy and displayed epochs, named train, test and display respectively.
#Initialize lists to store the training accuracy, testing accuracy and displayed epochs
import pandas as pd
train=[]
test=[]
display=[]
### running the tensorflow session
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized with learning_rate', learning_rate)
    for step in range(num_steps):
        # Since we are using stochastic gradient descent, we select small batches from the
        # training dataset and train the convolutional neural network one batch at a time.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        # applying the optimizer and loss to the training batch
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % display_step == 0:
            train_accuracy = accuracy(predictions, batch_labels)
            train.append(train_accuracy)
            # use the eval function to evaluate the accuracy of the test set
            test_accuracy = accuracy(test_prediction.eval(), test_labels)
            # append the values to the lists used to plot the accuracy graphs
            test.append(test_accuracy)
            display.append(step)
            message = "step {:04d} : loss is {:06.2f}, accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
            print(message)
Let's plot the model accuracy for the testing and training sets.
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
It is observed that with the tanh activation the network did not plateau; moreover, the test accuracy never improved beyond 10% across all the epochs. Hence the network did not plateau and this activation function was not suitable for it.
Train Accuracy = 22%, Test Accuracy = 10%
import tensorflow as tf
LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84
### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activation
### Different hyperparameters to tune
#activation={'tanh' : tf.nn.tanh,'relu': tf.nn.relu, 'softplus' : tf.nn.softplus}
# Change each instance of the sigmoid activation to tf.nn.relu
def model_lenet5(data, variables):
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.nn.relu(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.nn.relu(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.relu(layer3_fccd)
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.relu(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
Reuse the code that sets the initial hyperparameters. Make sure the number of epochs is 7001.
from collections import defaultdict
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
#change number of epochs to 7001
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
    #1) First we put the input data in a tensorflow friendly form.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
    tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
    tf_test_dataset = tf.constant(test_dataset, tf.float32)
    #2) Then, the weight matrices and bias vectors are initialized
    variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
    #3) The model used to calculate the logits (predicted labels)
    model = model_lenet5
    logits = model(tf_train_dataset, variables)
    #4) Then we compute the softmax cross entropy between the logits and the (actual) labels
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    #5) The optimizer is used to calculate the gradients of the loss function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    # Predictions for the training and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Initialize the TensorFlow session and run the evaluation on the training and testing sets.
train=[]
test=[]
display=[]
# init = tf.initialize_all_variables()
### running the tensorflow session
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized with learning_rate', learning_rate)
    for step in range(num_steps):
        # Since we are using stochastic gradient descent, we select small batches from the
        # training dataset and train the convolutional neural network one batch at a time.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % display_step == 0:
            train_accuracy = accuracy(predictions, batch_labels)
            train.append(train_accuracy)
            test_accuracy = accuracy(test_prediction.eval(), test_labels)
            test.append(test_accuracy)
            display.append(step)
            message = "step {:04d} : loss is {:10.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
            print(message)
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
It is observed that, similar to tanh, there was no increase in testing accuracy. Running the CNN for a larger number of epochs could certainly be considered, but while the sigmoid activation provided an accuracy of 42% at 7,000 epochs, tanh and ReLU did not match it. Here too the network did not plateau and was not even close to doing so.
Train Accuracy = 14%, Test Accuracy = 10%
Next, we will try a combination of a loss function and an activation function.
Let's try the combination of hinge loss and the tanh activation function.
Hinge loss is a loss used to train classifiers; it is used for maximum-margin classification.
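Before wiring it into the graph, here is a standalone sketch of the hinge loss itself in its binary form, with made-up scores (tf.losses.hinge_loss, used below, expects labels in {0, 1} and, to my understanding, converts them to ±1 internally before applying the same formula):

```python
import numpy as np

def hinge_loss(scores, labels01):
    # Convert {0, 1} labels to {-1, +1}, then apply max(0, 1 - y * score):
    # correctly classified samples beyond the margin contribute zero loss
    y = 2.0 * labels01 - 1.0
    return np.maximum(0.0, 1.0 - y * scores)

scores = np.array([2.5, 0.3, -1.0])  # raw classifier outputs (logits)
labels = np.array([1.0, 1.0, 0.0])   # 1 = positive class, 0 = negative class
losses = hinge_loss(scores, labels)
print(losses)         # only the second sample, inside the margin, has nonzero loss
print(losses.mean())  # mirrors the tf.reduce_mean over the batch
```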
Use the initial model defined earlier and replace the activation function in the model_lenet5 function with tanh. Follow the comments to observe the change in the code below.
import tensorflow as tf
LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84
### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activation
### Different hyperparameters to tune: replace sigmoid with tanh in the below function for all occurrences
# activation={'tanh' : tf.nn.tanh,'relu': tf.nn.relu, 'softplus' : tf.nn.softplus}
def model_lenet5(data, variables):
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.nn.tanh(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.nn.tanh(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.tanh(layer3_fccd)
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.tanh(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
Now we will change the cost function to accommodate the hinge loss. Follow the comments to observe the change made.
from collections import defaultdict
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
    #1) First we put the input data in a tensorflow friendly form.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_test_dataset = tf.constant(test_dataset, tf.float32)
    #2) Then, the weight matrices and bias vectors are initialized
    variables = variables_lenet5(image_width=image_width, image_height=image_height, image_depth=image_depth, num_labels=num_labels)
    #3) The model used to calculate the logits (predicted labels)
    model = model_lenet5
    logits = model(tf_train_dataset, variables)
    #4) Change the softmax cross-entropy loss to hinge loss:
    #we compute the hinge loss between the logits and the (actual) labels,
    #then the per-example losses are averaged with tf.reduce_mean
    loss = tf.reduce_mean(tf.losses.hinge_loss(logits=logits, labels=tf_train_labels))
    #5) The optimizer is used to calculate the gradients of the loss function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
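Even with hinge loss driving training, the predictions are still read out through a softmax so they can be compared as probabilities. A numerically stable numpy sketch of that readout:

```python
import numpy as np

def softmax(logits):
    # Subtract the row max before exponentiating so large logits
    # do not overflow; this leaves the result unchanged.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

p = softmax(np.array([[1.0, 2.0, 3.0]]))
print(p)   # each row sums to 1; larger logits get larger probabilities
```

The max-subtraction trick matters once logits grow large, which happens readily with an unbounded margin-based loss.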
Reuse the code to run the TensorFlow session as per the initial model. Ensure num_steps = 7001.
import pandas as pd
train=[]
test=[]
display=[]
### running the tensorflow session
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized with learning_rate', learning_rate)
    for step in range(num_steps):
        #Since we are using stochastic gradient descent, we select small batches
        #from the training dataset and train the network with one batch at a time.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % display_step == 0:
            train_accuracy = accuracy(predictions, batch_labels)
            train.append(train_accuracy)
            test_accuracy = accuracy(test_prediction.eval(), test_labels)
            test.append(test_accuracy)
            display.append(step)
            message = "step {:04d} : loss is {:06.2f}, accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
            print(message)
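The offset arithmetic in the loop above cycles through the training set, wrapping around before an incomplete final batch would occur. A small pure-Python sketch of that rule (the 200-example size is a hypothetical stand-in for train_labels.shape[0]):

```python
num_examples = 200   # hypothetical stand-in for train_labels.shape[0]
batch_size = 64

# Same rule as in the training loop: the modulo keeps every slice start
# low enough that each batch has exactly batch_size examples.
offsets = [(step * batch_size) % (num_examples - batch_size)
           for step in range(5)]
print(offsets)   # → [0, 64, 128, 56, 120]
```

Because the wrap point is (num_examples - batch_size) rather than num_examples, every slice fits inside the dataset, at the cost of revisiting some examples slightly more often than others.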
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
[Figure: training vs. validation accuracy over the logged steps]
efh+rSKIni9gOW1v+CKdsgCNOCFM8kEEdgh64oc4pD%0A4ZeiDjEwPioDnYDilZMj1M8oYvbMGBzQRAIRwdQdQWOI+cXtfR6qokYEB8zAl1wRehMEWwSkqCUi%0AyBzcR2MfYeisNsqf8TSFDLjRRTN441g3qy6vnFq1S9Ds4qnSaPUP+PMEPq3p7PfYqBh6N60VQ4KQ%0AdRSkdoC9OIJoDHQZGedYh5oHZCbiaCq72Iknfwa9Q15WxTvG0U/RDMgpSEhEUFVSgDOohPTiCIyM%0A6sgRgacPuk9JfkAQYiFfHEFmceJ1cx9rR6zL7wjiqBwacNOFSRbdEG8jmR+lxgbUTJGxElLjCHo8%0AZtsrYkSQhmE0gpD1BHIE4ggygzd/BGWzoWlNbK+bivDcgJvTHidX1JZEr8ixg6s2IREB+FVITY6g%0A23IEESMCqRgShNhJ8bhKcQSROLMbTm6F9/2nmOcKBPSG4ugl0BfdHBsomFrZaDAJiggA5lQUB1RI%0AezymjDRi1VD7AfPtpmx2QtYXhGmBlI9mENsfMR9iyz8T+2uLKwAVe45Aa/SAmw5fSWxjKSORwIig%0AsXKshNS/NRRRefT8AbOt5pC3miDYxh8RSPlomuk9C/t/Ddfca7SDYiUnF4pnxpwjyPVexKG99FBq%0Ae3pYVEpqYKgbRoamfKmmoEH2PR5NcX4OroIw0ZJ/GI1UDAlCbOQWmt4biQjSzI5/AZ/XbAvFSxzd%0AxXkjvQCUVNTiDPcBGysuq5cgAdtDc4JKSLs9OvJAmr5zpuKqZsmU1xWEaYVSlt6Q5AjSx8gg7Hwc%0AFt4em6zERJxV0B+bI/AOmr6DhvoETtIqSZwjqHKZEtIT7gF6hnXkbaF2SRQLQtwUiCNIL3t/BYNd%0AcP1fTe06cchMdFzoBmBBU+PU1h5nh9VUloA8gVKKRmt+sYkIIiSKz1tTyWRrSBBiJ98p5aNpQ2vY%0A/ijUXgVzVk3tWnFsDXX1JMERJDAiADOt7ISVI4gcERyAkjqTKxEEITZSKEUtjmAiLa9Ax0FY+SWz%0ATzcVnFXg6Y0pSTvYb7aG8kqqprZ2MMWVJvGUsMqhYk51DTDojVYxJIliQYibfKdsDaWN7Y+YrZQl%0Ad039WjE2lZ3pHiR3pBevo3CsfCwROBxmmyoBCqRgIgKfpaUX1hF0txpHMKs5IWsKwrSjoEQcQVro%0AOARHX4Jrv5AYpUyXf4i9PUew9WgnFaoPnYytFFdNQmYSwJgKKRC+auitn5j75Z9NyJqCMO2QHEGa%0A2P6oEWhb8fnEXC/GiOCNFjdVjj5yE7kt5KekNqERgZ+QEYGnH3Y9AYs/CuUNCVlTEKYdl0KOQCn1%0AuFKqXSm1P+jYTKXUn5VSR6z7GclaP2YGumDP07D0k2aWQCLwO4IoMhOjPs2R8328frSTy3L7UMVT%0AFJoLRQIjgkpXfqCJLGTV0O6nwNNj8iyCIMRHCnMECepYCsnPgH8Gfh507EHgZa31w0qpB62f/zaJ%0ANthn17+CdxDe98XEXTNERKC15lzvEHtau9nd2sOe1m72ne6h3+MFoLKkzyR3E01JrbFj1Bu7btIE%0ATAlpMe+e7mXmxKH1Ph+8+SjMWgEN105pHUGY1hSUwMgA6NGkL5U0R6C1flUp1Tjh8MeAtdbjJ4DN%0AJNER/HHfWZ57z8Nr/QcinpfjG+Gv9z1CZ+l1/OptBw51EIdSOBRg3WttPsR92gxk8Vk/a8DnM/eh%0A+LqjmHf2HeTfuw/QemGAPa3dtPeZgdR5OYrFdaXctXwWS+vLWTa7nJIf9Vo6RQnGVQNo4wxK66Z8%0AubmVLs64+8hxTKisOvIidB2DdX835TUEYVpjFYzkjCZ/gH0yI4JQ1Gitz1qPzwE14U5USj0APABQ%0AU1PD5s2bY17suYMetrSNQNvxiOd9RG2lJLeTrw18jk2vtZgPfQjcB2zCVJR
OvI+0v3ZfrouOc608%0Adfo45QWKeeU5fLA+n7llDhpKHeQ5vEAn9HXStm+Ey0cHOdbey6k4ft9IVHR2chWwc8sf6C+JcchO%0ACG4s87FgoW/Sv8vS3d+hqKCCN9vL0An+HaZKf39/XO+jdJBNtkJ22ZstttadOcNCYKjXnXx7tdZJ%0AuwGNwP6gn7snPH/BznWam5t1vGzatCnyCT6f1j9eo/U/NWs9OhrylNFRn/b5fPEZ8NP3a/2z6K/I%0AhgAADM9JREFUv7B3bs8ZrR8q1XrHY/GtFYnWnebaB/+YsEtO+tue3WvWeO37CVsjkUR9L2QQ2WSr%0A1tllb9bYumeD1g+V6u1/+Le4LwHs1DY+Y1NdNXReKVUHYN3HMb4rwZzaDmfegZVfDCuV7HCo+OcG%0AO6vsS1EPWOclJUfgn12cmMqhkGx/FPKKofm+5K0hCNMFa25xzujUVYOjkeqtod8B9wEPW/e/Tepq%0Aezey4NAG6P11+HNO74LCclh6d3JscFVB21v2zh1wm/tk5Aj8g3LsVA4NdJnk+bV/CYWl9q7fdx72%0AbTR9A0WZUwwmCFmLlSPI9Q4mfamkOQKl1C8xieFKpVQb8BDGAWxQSt0PnAQ+kaz1AWg/QIV7B/RF%0AaQ5b87eJ7eQNxlllPuB9o+DIiXyuP3JIVPlqMLn5xsHYiQhe/0d44//B8dfg0xshJy/6a3Y+DqPD%0Aia26EoTpTGGZJSEfrhQlcSSzauieME/dmqw1J/H+h9iWu4a1a9embMlJOKtB+8y3bFeURrGBLnOf%0AjIgArEllUSICTz/s+rmR3z62CZ7/Mnzsh5F1l0aGzPyG+R+CyqknogVBAOqWwt8cojsFiW3pLE42%0A/m/3drqLBzrRqORtrZTURI8Idj9pmsHu+qmJlHY/CZsfjvya/c+Y/MZUZbsFQUgLqc4RTD8CekPt%0AQBQlzgE33lwXedG2kOK2pRY6Dod/3uczCd/6a6F+hRGM62mDLQ9DWX3o2c1aw7ZHoPpKaFqTHLsF%0AQUgqEhEkm0B3sY3KoYudDOfHMR/ZLiU1ZiaBDrPnePhPcOE4rLS+2SsFf/EDuPwWs0V09KXJrzm+%0AxUwiW/nFqct2C4KQFsQRJBubekMADLgZybNZpRMPrlrwjYzlIiay/REorYdFHx07lpMH658wcwU2%0A3Adn90x4zaOm3PWq9cmzWxCEpCKOINkUzQBHrs0cQZIdgb+XoO/s5OfO7oUTr8H7HpisRVRYaqqH%0ACsvhyfXQfQqAooHTJoq49n7IizCyUhCEjEYcQbJRymoqy5CIAEInjLc/CnnO8PMDSuvg3mdMhdAv%0A1sHgBerbnoecfFhxf/JsFgQh6YgjSAV2uou1TmFEMKGEtO+8qfxZ9qnIFUvVi+DuJ00e4am7qT33%0ACixZN3ZdQRCyEnEEqcBZFT1H0N8OPm96IoKdj8HoiEn4RqPpRrjjUWjdTo7PIyWjgnAJIOWjqcBV%0ADZ1Hwj/v9cCz94Mjj+7yJcmzI78YCkrHRwQjQ7DjMVjwYai43N51rloHIwOc3L2FObVXJcdWQRBS%0AhjiCVOCsNDkCrSeXWGoNv/3PJlF750/ov5DkbZaJIyv3bTTNYHaigWCWf5bjvbOZk1jrBEFIA7I1%0AlAqc1eAdCj127uVvw74NcMt/M2Myk03wyEqtTZK4Zgk03ZT8tQVByEjEEaSCcL0EOx4zAm/Nn4Mb%0A/2tqbAmOCALNYH8lzWCCMI0RR5AKXJNnF3Poj/DC35i9+du/l7oPYn9E4JeGcFbBko+nZm1BEDIS%0AcQSpYOIQ+9O74JnPG3XBdY9PeZh8TJTUgncQzrxt5gtf+wVpBhOEaY44glTgHwrT324Guz/5CeMc%0APrUheXMQwuEvIX3pW1Yz2OdTu74gCBmHVA2lAr8Udcch2PZD0KNw77NjyqSpxN/8dfxVWHZvemwQ%0ABCGjE
EeQCnLyTMfuWz+GnAK473dQOT89tvgjAoi9ZFQQhEsS2RpKFc4qQMHHfwqzV6bPDn9E0HQT%0A1CaxeU0QhKxBIoJUsfqrRoV08cfSa0dBKdz8TVh4W3rtEAQhYxBHkCqWfSrdFhiUgjVfT7cVgiBk%0AELI1JAiCMM0RRyAIgjDNEUcgCIIwzUlLjkApdQLoA0YBr9Z6RTrsEARBENKbLL5Zax1lbJcgCIKQ%0AbGRrSBAEYZqjtNapX1Sp40APZmvox1rrn4Q45wHgAYCamprmp59+Oq61+vv7cblcU7A2tWSTvdlk%0AK2SXvdlkK2SXvdlkK0zN3ptvvnmXra13rXXKb8As674a2APcFOn85uZmHS+bNm2K+7XpIJvszSZb%0Atc4ue7PJVq2zy95sslXrqdkL7NQ2PpPTEhEEo5T6FtCvtf5ehHM6gJNxLlEJZFMuIpvszSZbIbvs%0AzSZbIbvszSZbYWr2ztFaV0U7KeXJYqWUE3Borfusxx8Evh3pNXZ+kQjr7dRZVJWUTfZmk62QXfZm%0Ak62QXfZmk62QGnvTUTVUA/xGmYlcucBTWus/pcEOQRAEgTQ4Aq31MWBpqtcVBEEQQjMdykcnVSRl%0AONlkbzbZCtllbzbZCtllbzbZCimwN+3JYkEQBCG9TIeIQBAEQYiAOAJBEIRpziXtCJRSH1ZKHVJK%0AHVVKPZgmGx5XSrUrpfYHHZuplPqzUuqIdT8j6LlvWPYeUkp9KOh4s1Jqn/XcPymr7CrBtjYopTYp%0ApQ4opd5VSn05w+0tVEq9pZTaY9n7PzLZXmudHKXUO0qp32eBrSesdXYrpXZmsr1KqXKl1DNKqYNK%0AqfeUUtdnsK0Lrb+p/9arlPpKWu2103WWjTcgB2gB5gL5mA7mxWmw4yZgObA/6Nh3gQetxw8Cf289%0AXmzZWQA0WfbnWM+9BawEFPBH4LYk2FoHLLcelwCHLZsy1V4FuKzHecCb1poZaa+1zteAp4DfZ/J7%0AwVrnBFA54VhG2gs8AXzBepwPlGeqrRPszgHOAXPSaW/SfsF034DrgReDfv4G8I002dLIeEdwCKiz%0AHtcBh0LZCLxo/R51wMGg4/dgNJqSbfdvgQ9kg71AMfA28L5MtReoB14GbmHMEWSkrda1TzDZEWSc%0AvUAZcByr+CWTbQ1h+weBrem291LeGpoFtAb93GYdywRqtNZnrcfnME12EN7mWdbjiceThlKqEbgG%0A8y07Y+21tlp2A+3An7XWmWzv/wW+DviCjmWqrQAaeEkptUsZEchMtbcJ6AD+1dp2+xdlVAsy0daJ%0A3A380nqcNnsvZUeQFWjjyjOqhlcp5QKeBb6ite4Nfi7T7NVaj2qtl2G+bV+nlFoy4fmMsFcp9RGg%0AXWu9K9w5mWJrEKutv+1twJeUUjcFP5lB9uZitl8f1VpfA1zEbK0EyCBbAyil8oGPAhsnPpdqey9l%0AR3AaaAj6ud46lgmcV0rVAVj37dbxcDafth5PPJ5wlFJ5GCfwpNb615lurx+tdTewCfhwhtq7Cvio%0AMtP5ngZuUUr9IkNtBUBrfdq6bwd+A1yXofa2AW1WNAjwDMYxZKKtwdwGvK21Pm/9nDZ7L2VHsAOY%0Ar5Rqsjzv3cDv0myTn98B91mP78PsxfuP362UKlBKNQHzgbescLFXKbXSqgr4bNBrEoZ17ceA97TW%0A/5gF9lYppcqtx0WYfMbBTLRXa/0NrXW91roR8158RWt9bybaCkYcUilV4n+M2cven4n2aq3PAa1K%0AqYXWoVuBA5lo6wTuYWxbyG9XeuxNZiIk3TfgdkzlSwvwzTTZ8EvgLDCC+eZyP1CBSRoeAV4CZgad%0A/03L3kMEVQAAKzD/EVuAf2ZCYixBtq7GhKN7gd3W7fYMtvdq4B3L3v3Af7eOZ6S9QWutZSxZnJG2%0AYqrt9li3d/3/fzLY3mXATuu98BwwI1
NttdZxAm6gLOhY2uwViQlBEIRpzqW8NSQIgiDYQByBIAjC%0ANEccgSAIwjRHHIEgCMI0RxyBIAjCNEccgSAkAaXUWmUpjApCpiOOQBAEYZojjkCY1iil7lVmpsFu%0ApdSPLRG7fqXU95WZcfCyUqrKOneZUmq7UmqvUuo3fr14pdQ8pdRLysxFeFspdbl1eZca08h/0q8V%0Ar5R6WJmZD3uVUt9L068uCAHEEQjTFqXUIuCTwCptxNVGgU9juj53aq2vBLYAD1kv+Tnwt1rrq4F9%0AQcefBH6otV4K3IDpJAej3voVjJ78XGCVUqoCuBO40rrO/0rubykI0RFHIExnbgWagR2WlPWtmA9s%0AH/Ar65xfAKuVUmVAudZ6i3X8CeAmS49nltb6NwBa6yGt9YB1zlta6zattQ8j19EI9ABDwGNKqbsA%0A/7mCkDbEEQjTGQU8obVeZt0Waq2/FeK8eHVYPEGPR4FcrbUXo+L5DPAR4E9xXlsQEoY4AmE68zKw%0ATilVDYF5vHMw/y/WWed8Cnhda90DXFBK3Wgd/wywRWvdB7Qppe6wrlGglCoOt6A166FMa/0C8FVg%0AaTJ+MUGIhdx0GyAI6UJrfUAp9XfAvyulHBiF2C9hBptcZz3XjskjgJEG/pH1QX8M+Jx1/DPAj5VS%0A37ausT7CsiXAb5VShZiI5GsJ/rUEIWZEfVQQJqCU6tdau9JthyCkCtkaEgRBmOZIRCAIgjDNkYhA%0AEARhmiOOQBAEYZojjkAQBGGaI45AEARhmiOOQBAEYZrz/wGeqe+kX79t/gAAAABJRU5ErkJggg==
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
Though the model did not reach the benchmark accuracy in this case, it showed a substantial improvement, and it is certainly worth evaluating its performance over a larger number of epochs.
Highest Accuracy: Train Accuracy = 34.38%, Test Accuracy = 32.52%
Next, we will try the hinge loss without applying reduce_mean to the cost, and change the activation function to Sigmoid.
Reuse the code from the initial model, and ensure the function model_lenet5 uses the Sigmoid activation function. Follow the comments to see the change.
import tensorflow as tf

LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84

### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activation. Set the activation function to Sigmoid
def model_lenet5(data, variables):
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.sigmoid(layer3_fccd)
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.sigmoid(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
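The shape of w3 above assumes that the two convolution/pooling stages reduce each spatial dimension of a 32x32 CIFAR-10 input to 6, which is why the hard-coded image_width // 5 factor happens to work. A quick sanity check of that arithmetic (a pure-Python sketch, not part of the original notebook):

```python
# Trace the spatial size of a 32x32 CIFAR-10 input through the two
# convolution/pooling stages of model_lenet5.
def spatial_size(n):
    # conv1: 5x5 kernel, stride 1, padding SAME -> size unchanged
    n = n // 2        # avg_pool 2x2, stride 2 -> halved
    n = n - 5 + 1     # conv2: 5x5 kernel, stride 1, padding VALID -> n - 4
    n = (n + 1) // 2  # avg_pool 2x2, stride 2, padding SAME -> ceil(n / 2)
    return n

print(spatial_size(32))  # 6, matching the hard-coded 32 // 5 in the w3 shape
```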
Reuse the code from the initial model, replace the cost function with hinge_loss, and remove the reduce_mean function. Observe the comments closely to see the change.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
    #1) First we put the input data in a tensorflow friendly form.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
    tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
    tf_test_dataset = tf.constant(test_dataset, tf.float32)
    #2) Then, the weight matrices and bias vectors are initialized
    variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
    #3. The model used to calculate the logits (predicted labels)
    model = model_lenet5
    logits = model(tf_train_dataset, variables)
    # change the softmax cross entropy to hinge_loss and ensure to remove the reduce_mean function
    #4. then we compute the hinge_loss between the logits and the (actual) labels
    loss = tf.losses.hinge_loss(logits=logits, labels=tf_train_labels)
    #5. The optimizer is used to calculate the gradients of the loss function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Reuse the code for initializing the TensorFlow session from the first model. Ensure num_steps = 7001.
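The session code below calls an accuracy() helper that was defined earlier in the blog and is not repeated in this section. If you are running this part standalone, a minimal sketch consistent with how it is used (one-hot labels, softmax predictions, percentage output) could look like this; the exact original definition may differ:

```python
import numpy as np

# Hypothetical sketch of the accuracy helper assumed by the session loop.
def accuracy(predictions, labels):
    # Fraction of rows where the arg-max class of the prediction matches
    # the arg-max of the one-hot label, expressed as a percentage.
    correct = np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
    return 100.0 * correct / predictions.shape[0]
```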
train=[]
test=[]
display=[]
### running the tensorflow session
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized with learning_rate', learning_rate)
    for step in range(num_steps):
        #Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
        #and training the convolutional neural network each time with a batch.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % display_step == 0:
            train_accuracy = accuracy(predictions, batch_labels)
            train.append(train_accuracy)
            test_accuracy = accuracy(test_prediction.eval(), test_labels)
            test.append(test_accuracy)
            display.append(step)
            message = "step {:04d} : loss is {:10.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
            print(message)
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
Removing the reduce_mean function while applying the hinge loss had a clear effect on the accuracy: the test accuracy did not improve beyond 10%, and the network also failed to plateau. Hence, the reduce_mean function was required to improve the accuracy. At this stage, the Gradient Descent Optimizer still worked out best.
Highest Train Accuracy = 17%
Highest Test Accuracy = 10%
Though Softmax Cross Entropy with Logits worked best, it is worth trying the hinge loss with the Tanh activation.
The hinge loss with the reduce_mean function and the Tanh activation works best among the hinge-loss variants; hence, it is a prospective choice apart from Softmax Cross Entropy with Logits and the Sigmoid function.
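For reference, tf.losses.hinge_loss treats the one-hot {0, 1} labels as per-class binary targets in {-1, +1} and penalizes max(0, 1 - target * logit) per entry. A numpy sketch of that computation (using a plain mean, which is what the default reduction amounts to with unit weights):

```python
import numpy as np

def hinge_loss(logits, labels):
    targets = 2.0 * labels - 1.0                          # map one-hot {0, 1} to {-1, +1}
    per_entry = np.maximum(0.0, 1.0 - targets * logits)   # element-wise hinge
    return np.mean(per_entry)                             # mean-style reduction

logits = np.array([[2.0, -1.0], [0.5, 0.5]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
print(hinge_loss(logits, labels))  # 0.5
```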
Next, let's observe the effect of the number of epochs on the CNN model. Though it is evident that we need to train the CIFAR-10 model further to improve accuracies to 80-90%, it is instructive to check whether increasing the number of epochs indeed increases the accuracy.
The model was trained for 10000 epochs. The model design is the same as the first model, with the only change being the number of epochs.
Start by reusing the functions that initialize the layers with weights and biases. Make sure the activation is set back to Sigmoid to avoid errors.
import tensorflow as tf

LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84

### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activation
def model_lenet5(data, variables):
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.sigmoid(layer3_fccd)
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.sigmoid(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
Next, use the code of the initial model to initialize the hyperparameters. Set the number of steps to 10000. Follow the comments to see the change made.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
# set the num_steps to 10000
num_steps = 10000
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
    #1) First we put the input data in a tensorflow friendly form.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
    tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
    tf_test_dataset = tf.constant(test_dataset, tf.float32)
    #2) Then, the weight matrices and bias vectors are initialized
    variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
    #3. The model used to calculate the logits (predicted labels)
    model = model_lenet5
    logits = model(tf_train_dataset, variables)
    #4. then we compute the softmax cross entropy between the logits and the (actual) labels
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    #5. The optimizer is used to calculate the gradients of the loss function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Use the same code as the initial model to initialize the TensorFlow session.
train=[]
test=[]
display=[]
#number of iterations and learning rate
### running the tensorflow session
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized with epochs', num_steps)
    for step in range(num_steps):
        #Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
        #and training the convolutional neural network each time with a batch.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % display_step == 0:
            train_accuracy = accuracy(predictions, batch_labels)
            train.append(train_accuracy)
            test_accuracy = accuracy(test_prediction.eval(), test_labels)
            test.append(test_accuracy)
            display.append(step)
            message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
            print(message)
Let's plot the accuracy vs. the number of epochs to note the changes.
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
It is clearly observed that the accuracy increases with the number of epochs. There are instances where the network gets lost in parts of the loss landscape, which explains the occasional drops in accuracy from 50% to 30%, owing to the random seed and the loss landscape. We also observe a uniform rise in both testing and training accuracy, which is a good sign. There is definitely scope to train the network further and watch it plateau.
It is clearly visible that a rise in the number of epochs causes a rise in accuracy, and the network should eventually plateau.
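One caveat when reading these numbers: num_steps counts mini-batch optimization steps, not full passes over the training set. Assuming the standard 50,000 CIFAR-10 training images, a quick calculation shows each run covers only a modest number of true passes over the data:

```python
train_size = 50000  # standard CIFAR-10 training set size (assumption)
batch_size = 64

# Convert step counts used in this blog into full passes over the data.
for num_steps in (7001, 10000, 15001):
    passes = num_steps * batch_size / train_size
    print("{} steps ~ {:.1f} passes over the data".format(num_steps, passes))
```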
Highest: Train Accuracy = 60%, Test Accuracy = 43%
Next, let's select 15000 epochs; the model was trained for 15000 epochs.
Reuse the code from the initial model to assign weights and biases to the network.
import tensorflow as tf

LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84

### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activation
def model_lenet5(data, variables):
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.sigmoid(layer3_fccd)
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.sigmoid(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
Reuse the code to initialize the hyperparameters, same as the initial model, and modify num_steps to 15001. Follow the comments to see the change made.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
# Change num_steps=15001
num_steps = 15001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
    #1) First we put the input data in a tensorflow friendly form.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
    tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
    tf_test_dataset = tf.constant(test_dataset, tf.float32)
    #2) Then, the weight matrices and bias vectors are initialized
    variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
    #3. The model used to calculate the logits (predicted labels)
    model = model_lenet5
    logits = model(tf_train_dataset, variables)
    #4. then we compute the softmax cross entropy between the logits and the (actual) labels
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    #5. The optimizer is used to calculate the gradients of the loss function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Reuse the code to initialize the TensorFlow session from the initial model.
train=[]
test=[]
display=[]
#number of iterations and learning rate
### running the tensorflow session
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized with epochs', num_steps)
    for step in range(num_steps):
        #Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
        #and training the convolutional neural network each time with a batch.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % display_step == 0:
            train_accuracy = accuracy(predictions, batch_labels)
            train.append(train_accuracy)
            test_accuracy = accuracy(test_prediction.eval(), test_labels)
            test.append(test_accuracy)
            display.append(step)
            message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
            print(message)
Let's plot the graph to see the accuracy.
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
In comparison to the 10000-epoch run, there is more stability between the testing and training accuracies at 15000 epochs. In the 10000-epoch run, the final epochs showed a drastic difference between training and testing accuracy, whereas here the two stay closer together. The network still loses its way in the loss landscape at around 6000 epochs, and again at around 8000 epochs, with a noticeable change in accuracy. More iterations might see the network improve its accuracy further. From the references, it is clear that CIFAR-10 can reach an accuracy of around 80% when trained for 150000 epochs.
Highest Train Accuracy = 60%
Highest Test Accuracy = 48%
The accuracies are similar to the previous tuning result, but with more stability between the train and test accuracies.
import tensorflow as tf

LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84

### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
                     filter_depth2 = LENET5_FILTER_DEPTH_2,
                     num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
                     image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
    w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
    b1 = tf.Variable(tf.zeros([filter_depth1]))
    w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
    b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
    w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
    b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
    w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
    b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
    w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
    b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
    variables = {
        'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
        'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
    }
    return variables

### Setting up the layers and activation
def model_lenet5(data, variables):
    layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
    layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
    layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
    layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
    layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
    flat_layer = flatten_tf_array(layer2_pool)
    layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
    layer3_actv = tf.nn.sigmoid(layer3_fccd)
    layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
    layer4_actv = tf.nn.sigmoid(layer4_fccd)
    logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
    return logits
Initialize the hyperparameters from the earlier model and change the optimizer to AdamOptimizer. Follow the comments to observe the change made.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
#1) First we put the input data in a tensorflow friendly form.
tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
tf_test_dataset = tf.constant(test_dataset, tf.float32)
#2) Then, the weight matrices and bias vectors are initialized
variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
#3. The model used to calculate the logits (predicted labels)
model = model_lenet5
logits = model(tf_train_dataset, variables)
#4. then we compute the softmax cross entropy between the logits and the (actual) labels
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
#5. The optimizer is used to calculate the gradients of the loss function
# Change to Adam Optimizer here
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)
# Predictions for the training, validation, and test data.
train_prediction = tf.nn.softmax(logits)
test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Initialize the session as in the previous model and run it.
train=[]
test=[]
display=[]
### running the tensorflow session
with tf.Session(graph=graph) as session:
tf.global_variables_initializer().run()
print('Initialized with epochs', num_steps)
for step in range(num_steps):
#Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
#and training the convolutional neural network each time with a batch.
offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
batch_labels = train_labels[offset:(offset + batch_size), :]
feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
_, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
if step % display_step == 0:
train_accuracy = accuracy(predictions, batch_labels)
train.append(train_accuracy)
test_accuracy = accuracy(test_prediction.eval(), test_labels)
test.append(test_accuracy)
display.append(step)
message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
print(message)
Plot the accuracy results vs the epochs
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
It is observed that with the Adam Optimizer there is no increase in test accuracy. The training and testing accuracies do not improve consistently, so the network would plateau only after a considerable number of epochs. Against the 7,000-step benchmark set with the Gradient Descent Optimizer, switching to the Adam Optimizer brought no improvement.
Highest Train Accuracy = 17%, Test Accuracy = 10%
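One likely culprit is the learning rate: 0.5 was chosen for gradient descent, while Adam's default in `tf.train.AdamOptimizer` is 0.001, and Adam's effective per-step movement is roughly bounded by the learning rate regardless of gradient magnitude. A minimal NumPy sketch of a single Adam update (illustrative values, not the TensorFlow implementation) shows this:

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: biased moment estimates, bias correction, scaled step."""
    m = b1 * m + (1 - b1) * g          # first moment (running mean of gradients)
    v = b2 * v + (1 - b2) * g * g      # second moment (running mean of squared gradients)
    m_hat = m / (1 - b1 ** t)          # bias correction for zero initialization
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Even for a huge gradient, the step size stays close to lr.
w, m, v = 1.0, 0.0, 0.0
w2, m, v = adam_step(w, g=100.0, m=m, v=v, t=1, lr=0.5)
print(abs(w - w2))  # step magnitude close to lr = 0.5, not 100
```

So with lr = 0.5, nearly every Adam step moves each weight by roughly 0.5, which is enormous for a network whose weights start with stddev 0.1.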
import tensorflow as tf
LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84
### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
filter_depth2 = LENET5_FILTER_DEPTH_2,
num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
b1 = tf.Variable(tf.zeros([filter_depth1]))
w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
variables = {
'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
}
return variables
### Setting up the layers and activation
def model_lenet5(data, variables):
layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
flat_layer = flatten_tf_array(layer2_pool)
layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
layer3_actv = tf.nn.sigmoid(layer3_fccd)
layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
layer4_actv = tf.nn.sigmoid(layer4_fccd)
logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
return logits
Reuse the code to set the hyperparameters from the initial model. Change the optimizer to Adadelta; follow the comments to see the change.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
#1) First we put the input data in a tensorflow friendly form.
tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
tf_test_dataset = tf.constant(test_dataset, tf.float32)
#2) Then, the weight matrices and bias vectors are initialized
variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
#3. The model used to calculate the logits (predicted labels)
model = model_lenet5
logits = model(tf_train_dataset, variables)
#4. then we compute the softmax cross entropy between the logits and the (actual) labels
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
#5. The optimizer is used to calculate the gradients of the loss function
# Change the optimizer to tf.train.AdadeltaOptimizer
optimizer = tf.train.AdadeltaOptimizer(learning_rate).minimize(loss)
# Predictions for the training, validation, and test data.
train_prediction = tf.nn.softmax(logits)
test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
# Create lists to store the values for plotting
train=[]
test=[]
display=[]
### running the tensorflow session
with tf.Session(graph=graph) as session:
tf.global_variables_initializer().run()
print('Initialized with epochs', num_steps)
for step in range(num_steps):
#Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
#and training the convolutional neural network each time with a batch.
offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
batch_labels = train_labels[offset:(offset + batch_size), :]
feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
_, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
if step % display_step == 0:
train_accuracy = accuracy(predictions, batch_labels)
train.append(train_accuracy)
test_accuracy = accuracy(test_prediction.eval(), test_labels)
test.append(test_accuracy)
display.append(step)
message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
print(message)
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
Train Accuracy = 36%, Test Accuracy = 28%
Though the Stochastic Gradient Descent optimizer performed the best, the Adagrad optimizer could be a promising parameter to tune. Additionally, the Adam optimizer did not seem like a desirable gradient-estimation choice to pursue.
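Adadelta behaves differently from plain SGD because it derives its effective step size from running averages of squared gradients and squared past updates, which makes it less sensitive to the chosen learning rate. A minimal NumPy sketch of the update rule (following Zeiler's 2012 formulation; note that `tf.train.AdadeltaOptimizer` additionally scales this step by the learning rate):

```python
import numpy as np

def adadelta_step(w, g, eg2, edx2, rho=0.95, eps=1e-6):
    """One Adadelta update: the step size is the ratio of running RMS values
    of past updates and past gradients, so no global learning rate is needed."""
    eg2 = rho * eg2 + (1 - rho) * g * g              # running average of g^2
    dx = -np.sqrt(edx2 + eps) / np.sqrt(eg2 + eps) * g
    edx2 = rho * edx2 + (1 - rho) * dx * dx          # running average of dx^2
    return w + dx, eg2, edx2

# Minimize f(w) = w^2 (gradient 2w) from w = 5.
w, eg2, edx2 = 5.0, 0.0, 0.0
for _ in range(2000):
    w, eg2, edx2 = adadelta_step(w, 2.0 * w, eg2, edx2)
```

The small `eps` makes the first steps tiny and the method accelerate gradually, which matches the slow-but-steady behavior observed in this run.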
I have tuned parameters to change the connection type. In this case, the pooling changes: average pooling has been replaced by max pooling. Dropout has also been incorporated, randomly dropping 50% of the neurons from the flattened and fully connected layers.
There is a slight difference in the weights and biases assigned for this model, and dropout has been added to the layers. Follow the comments to observe the changes.
LENET5_LIKE_BATCH_SIZE = 32
LENET5_LIKE_FILTER_SIZE = 5
LENET5_LIKE_FILTER_DEPTH = 16
LENET5_LIKE_NUM_HIDDEN = 120
# Create the function as before, renamed for convenience. The weights and biases remain the same.
def variables_lenet5_like(filter_size = LENET5_LIKE_FILTER_SIZE,
filter_depth = LENET5_LIKE_FILTER_DEPTH,
num_hidden = LENET5_LIKE_NUM_HIDDEN,
image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth], stddev=0.1))
b1 = tf.Variable(tf.zeros([filter_depth]))
w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth, filter_depth], stddev=0.1))
b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth]))
w3 = tf.Variable(tf.truncated_normal([(image_width // 4)*(image_width // 4)*filter_depth , num_hidden], stddev=0.1))
b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden]))
w4 = tf.Variable(tf.truncated_normal([num_hidden, num_hidden], stddev=0.1))
b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden]))
w5 = tf.Variable(tf.truncated_normal([num_hidden, num_labels], stddev=0.1))
b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
variables = {
'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
}
return variables
# Here average pooling has been changed to max pooling.
def model_lenet5_like(data, variables):
layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
layer1_actv = tf.nn.relu(layer1_conv + variables['b1'])
layer1_pool = tf.nn.max_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='SAME')
layer2_actv = tf.nn.relu(layer2_conv + variables['b2'])
layer2_pool = tf.nn.max_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
# Introduce dropout: on each iteration 50% of the neurons are dropped
# from the flattened layer and the fully connected layer
flat_layer = flatten_tf_array(layer2_pool)
layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
layer3_actv = tf.nn.relu(layer3_fccd)
layer3_drop = tf.nn.dropout(layer3_actv, 0.5)
layer4_fccd = tf.matmul(layer3_drop, variables['w4']) + variables['b4']
layer4_actv = tf.nn.relu(layer4_fccd)
layer4_drop = tf.nn.dropout(layer4_actv, 0.5)
logits = tf.matmul(layer4_drop, variables['w5']) + variables['b5']
return logits
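One caveat with a hard-coded rate of 0.5: the same model function also builds `test_prediction`, so dropout is applied at evaluation time too, which depresses the reported test accuracy. The usual fix is to feed the keep probability through the graph (0.5 for training, 1.0 for evaluation). A NumPy sketch of inverted dropout, the scheme `tf.nn.dropout` uses, illustrates why evaluation should keep everything:

```python
import numpy as np

def dropout(x, keep_prob, training, rng):
    """Inverted dropout: kept units are scaled by 1/keep_prob during training
    so the expected activation matches evaluation, where nothing is dropped."""
    if not training or keep_prob >= 1.0:
        return x
    mask = rng.random_sample(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.RandomState(0)
x = np.ones((4, 8))
train_out = dropout(x, keep_prob=0.5, training=True, rng=rng)
eval_out = dropout(x, keep_prob=0.5, training=False, rng=rng)
print(eval_out.mean())  # 1.0: at evaluation time the layer is the identity
```

In the TF 1.x graph here, that would mean passing `keep_prob` as a `tf.placeholder` into `model_lenet5_like` and feeding 1.0 when evaluating `test_prediction` (a change not made in this post).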
The hyperparameters for tuning the model remain the same; this code can be reused from the initial model.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
#1) First we put the input data in a tensorflow friendly form.
tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
tf_test_dataset = tf.constant(test_dataset, tf.float32)
#2) Then, the weight matrices and bias vectors are initialized
variables = variables_lenet5_like(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
#3. The model used to calculate the logits (predicted labels)
model = model_lenet5_like
logits = model(tf_train_dataset, variables)
#4. then we compute the softmax cross entropy between the logits and the (actual) labels
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
#5. The optimizer is used to calculate the gradients of the loss function
optimizer = tf.train.AdagradOptimizer(learning_rate).minimize(loss)
# Predictions for the training, validation, and test data.
train_prediction = tf.nn.softmax(logits)
test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Reuse the code from the initial model to start the TensorFlow session.
train=[]
test=[]
display=[]
with tf.Session(graph=graph) as session:
tf.global_variables_initializer().run()
print('Initialized with learning_rate', learning_rate)
for step in range(num_steps):
#Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
#and training the convolutional neural network each time with a batch.
offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
batch_labels = train_labels[offset:(offset + batch_size), :]
feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
_, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
if step % display_step == 0:
train_accuracy = accuracy(predictions, batch_labels)
train.append(train_accuracy)
test_accuracy = accuracy(test_prediction.eval(), test_labels)
test.append(test_accuracy)
display.append(step)
message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
print(message)
Plot the graph as in the initial model.
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
Hence, using 7,000 steps as a benchmark, this network architecture did not improve within the given time span. The network would have to be run for longer to observe whether it plateaus, but judging from the accuracies, a plateau seems unlikely to come soon.
Train Accuracy = 17%, Test Accuracy = 10%
In order to understand the importance of the convolution layers, I eliminated them and incorporated only the flattening step and a fully connected output layer. In other words, the network now consists of a single fully connected layer operating on the flattened image.
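One way to see why the convolutional layers matter is to count weights: the single fully connected layer used below already holds far more parameters than both LeNet-5 convolutions combined, yet lacks their spatial structure. A quick back-of-the-envelope check (CIFAR-10 shapes assumed, as elsewhere in this post):

```python
# CIFAR-10 shapes: 32x32 RGB images, 10 classes.
image_width, image_height, image_depth, num_labels = 32, 32, 3, 10

# Fully connected model below: one weight per (pixel, label) pair, plus biases.
fc_params = image_width * image_height * image_depth * num_labels + num_labels

# LeNet-5 convolutions: 5x5 filters, depths 3 -> 6 and 6 -> 16, plus biases.
conv_params = (5 * 5 * 3 * 6 + 6) + (5 * 5 * 6 * 16 + 16)

print(fc_params, conv_params)  # 30730 2872
```

So the convolutions achieve their feature extraction with roughly a tenth of the parameters of even this minimal fully connected map.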
Let's observe the structure below; follow the comments to see the changes.
train=[]
test=[]
display=[]
import tensorflow as tf
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the dataset
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 100
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
#1) First we put the input data in a tensorflow friendly form.
tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth),name='tf_train_dataset')
tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
tf_test_dataset = tf.constant(test_dataset, tf.float32)
#2) Then, the weight matrices and bias vectors are initialized
#as a default, tf.truncated_normal() is used for the weight matrix and tf.zeros() is used for the bias vector.
weights = tf.Variable(tf.truncated_normal([image_width * image_height * image_depth, num_labels]), tf.float32)
bias = tf.Variable(tf.zeros([num_labels]), tf.float32)
#3) define the model:
#A one layered fccd simply consists of a matrix multiplication
# Got rid of the conv2d layers and kept only the flattened layer
def model(data, weights, bias):
return tf.matmul(flatten_tf_array(data), weights) + bias
logits = model(tf_train_dataset, weights, bias)
#4) calculate the loss, which will be used in the optimization of the weights
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
#5) Choose an optimizer. Many are available.
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
#6) The predicted values for the images in the train dataset and test dataset are assigned to the variables train_prediction and test_prediction.
#It is only necessary if you want to know the accuracy by comparing it with the actual values.
train_prediction = tf.nn.softmax(logits)
test_prediction = tf.nn.softmax(model(tf_test_dataset, weights, bias))
with tf.Session(graph=graph) as session:
tf.global_variables_initializer().run()
print('Initialized')
for step in range(num_steps):
offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
batch_labels = train_labels[offset:(offset + batch_size), :]
feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
_, l, predictions = session.run([optimizer, loss, train_prediction],feed_dict=feed_dict)
if (step % display_step == 0):
train_accuracy = accuracy(predictions, batch_labels)
train.append(train_accuracy)
test_accuracy = accuracy(test_prediction.eval(), test_labels)
test.append(test_accuracy)
display.append(step)
message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
print(message)
Plot the graph as done in the previous examples.
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
Hence it is observed that using only fully connected layers improved computation speed but did not improve accuracy significantly, demonstrating the importance of the convolutional layers.
Training Accuracy = 39% (unstable, with drastic rises and falls), Testing Accuracy = 32% (also unstable)
It is observed that the network architecture plays an important role in network performance. Other prospective parameters to change include the number of convolutional layers, the pooling function, and the activation function, to name a few.
For network initialization, I have used Xavier initialization and random_normal initialization, which is a form of Gaussian initialization.
Let's first have a look at Xavier initialization.
It helps signals reach deep into the network.
Xavier initialization makes sure the weights are ‘just right’, keeping the signal in a reasonable range of values through many layers.
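Concretely, Glorot and Bengio's scheme keeps the variance of activations roughly constant across layers by drawing weights with variance 2/(fan_in + fan_out). A NumPy sketch of the uniform variant (`tf.contrib.layers.xavier_initializer` defaults to uniform; the fan computation here is a simplification for 2-D weight matrices):

```python
import numpy as np

def xavier_uniform(shape, rng):
    """Glorot/Xavier uniform init: limit = sqrt(6 / (fan_in + fan_out)),
    giving the weights a variance of 2 / (fan_in + fan_out)."""
    fan_in, fan_out = shape[-2], shape[-1]   # simplification for 2-D weights
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)

rng = np.random.RandomState(0)
w = xavier_uniform((120, 84), rng)
print(round(w.var(), 4))  # close to 2 / (120 + 84), i.e. about 0.0098
```

For convolutional filters the fans also multiply in the spatial filter size, which TensorFlow's initializer handles automatically.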
Let's see how to implement Xavier initialization.
Reuse the code from the previous model for initializing the weights and assigning them to the layers. Incorporate an additional 'initializer' argument in the weights of the variables_lenet5 function. Follow the comments below to see the changes made.
import tensorflow as tf
LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84
### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
filter_depth2 = LENET5_FILTER_DEPTH_2,
num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
# Add an initializer parameter to each of the weights: tf.contrib.layers.xavier_initializer()
w1 = tf.get_variable("w1",[filter_size, filter_size, image_depth, filter_depth1],initializer=tf.contrib.layers.xavier_initializer())
# w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
b1 = tf.Variable(tf.zeros([filter_depth1]))
w2 = tf.get_variable("w2",[filter_size, filter_size, filter_depth1, filter_depth2],initializer=tf.contrib.layers.xavier_initializer())
# w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
w3 = tf.get_variable("w3",[(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1],initializer=tf.contrib.layers.xavier_initializer())
# w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
w4 = tf.get_variable("w4",[num_hidden1, num_hidden2],initializer=tf.contrib.layers.xavier_initializer())
# w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
w5 = tf.get_variable("w5",[num_hidden2, num_labels],initializer=tf.contrib.layers.xavier_initializer())
# w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
variables = {
'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
}
return variables
### Setting up the layers and activation
def model_lenet5(data, variables):
layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
flat_layer = flatten_tf_array(layer2_pool)
layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
layer3_actv = tf.nn.sigmoid(layer3_fccd)
layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
layer4_actv = tf.nn.sigmoid(layer4_fccd)
logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
return logits
Reuse the code for the hyperparameter initialization.
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
#1) First we put the input data in a tensorflow friendly form.
tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
tf_test_dataset = tf.constant(test_dataset, tf.float32)
#2) Then, the weight matrices and bias vectors are initialized
variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
#3. The model used to calculate the logits (predicted labels)
model = model_lenet5
logits = model(tf_train_dataset, variables)
#4. then we compute the softmax cross entropy between the logits and the (actual) labels
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
#5. The optimizer is used to calculate the gradients of the loss function
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Predictions for the training, validation, and test data.
train_prediction = tf.nn.softmax(logits)
test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
Reuse the code from the initial model to run the TensorFlow session.
### running the tensorflow session
train=[]
test=[]
display=[]
with tf.Session(graph=graph) as session:
tf.global_variables_initializer().run()
print('Initialized with learning_rate', learning_rate)
for step in range(num_steps):
#Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
#and training the convolutional neural network each time with a batch.
offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
batch_labels = train_labels[offset:(offset + batch_size), :]
feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
_, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
if step % display_step == 0:
train_accuracy = accuracy(predictions, batch_labels)
train.append(train_accuracy)
test_accuracy = accuracy(test_prediction.eval(), test_labels)
test.append(test_accuracy)
display.append(step)
message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
print(message)
Plot the accuracy vs. epochs, same as the initial model.
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
It is observed that Xavier initialization definitely helped: the testing and training accuracies improved steadily, with an almost constant decrease in loss, unlike earlier cases where the network would get stuck in poor regions of the loss landscape. At the 7,000-step benchmark it reached a similar accuracy, with training accuracy almost reaching 40%.
With this weight initialization there is a chance of improving accuracy further with more epochs; hence, tuning the weight initialization helped the network. It might also allow the network to plateau in fewer epochs.
Let's try Gaussian initialization to observe whether the model's accuracy changes.
Reuse the code from the initial model for initializing the weights and biases, changing the initializer from truncated_normal to random_normal. random_normal is a Gaussian form of initialization. Follow the comments to view the changes in the TensorFlow code.
import tensorflow as tf
LENET5_BATCH_SIZE = 32
LENET5_FILTER_SIZE = 5
LENET5_FILTER_DEPTH_1 = 6
LENET5_FILTER_DEPTH_2 = 16
LENET5_NUM_HIDDEN_1 = 120
LENET5_NUM_HIDDEN_2 = 84
### Designing the weights and biases for the network
def variables_lenet5(filter_size = LENET5_FILTER_SIZE, filter_depth1 = LENET5_FILTER_DEPTH_1,
filter_depth2 = LENET5_FILTER_DEPTH_2,
num_hidden1 = LENET5_NUM_HIDDEN_1, num_hidden2 = LENET5_NUM_HIDDEN_2,
image_width = 28, image_height = 28, image_depth = 1, num_labels = 10):
# Each of the weights is set to tf.random_normal
w1 = tf.Variable(tf.random_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
# w1 = tf.Variable(tf.truncated_normal([filter_size, filter_size, image_depth, filter_depth1], stddev=0.1))
b1 = tf.Variable(tf.zeros([filter_depth1]))
w2 = tf.Variable(tf.random_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
# w2 = tf.Variable(tf.truncated_normal([filter_size, filter_size, filter_depth1, filter_depth2], stddev=0.1))
b2 = tf.Variable(tf.constant(1.0, shape=[filter_depth2]))
w3 = tf.Variable(tf.random_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
# w3 = tf.Variable(tf.truncated_normal([(image_width // 5)*(image_height // 5)*filter_depth2, num_hidden1], stddev=0.1))
b3 = tf.Variable(tf.constant(1.0, shape = [num_hidden1]))
w4 = tf.Variable(tf.random_normal([num_hidden1, num_hidden2], stddev=0.1))
# w4 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1))
b4 = tf.Variable(tf.constant(1.0, shape = [num_hidden2]))
w5 = tf.Variable(tf.random_normal([num_hidden2, num_labels], stddev=0.1))
# w5 = tf.Variable(tf.truncated_normal([num_hidden2, num_labels], stddev=0.1))
b5 = tf.Variable(tf.constant(1.0, shape = [num_labels]))
variables = {
'w1': w1, 'w2': w2, 'w3': w3, 'w4': w4, 'w5': w5,
'b1': b1, 'b2': b2, 'b3': b3, 'b4': b4, 'b5': b5
}
return variables
### Setting up the layers and activation
def model_lenet5(data, variables):
layer1_conv = tf.nn.conv2d(data, variables['w1'], [1, 1, 1, 1], padding='SAME')
layer1_actv = tf.sigmoid(layer1_conv + variables['b1'])
layer1_pool = tf.nn.avg_pool(layer1_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
layer2_conv = tf.nn.conv2d(layer1_pool, variables['w2'], [1, 1, 1, 1], padding='VALID')
layer2_actv = tf.sigmoid(layer2_conv + variables['b2'])
layer2_pool = tf.nn.avg_pool(layer2_actv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
flat_layer = flatten_tf_array(layer2_pool)
layer3_fccd = tf.matmul(flat_layer, variables['w3']) + variables['b3']
layer3_actv = tf.nn.sigmoid(layer3_fccd)
layer4_fccd = tf.matmul(layer3_actv, variables['w4']) + variables['b4']
layer4_actv = tf.nn.sigmoid(layer4_fccd)
logits = tf.matmul(layer4_actv, variables['w5']) + variables['b5']
return logits
Reuse the code for initializing the hyperparameters from the initial model
#parameters determining the model size
image_width = c10_image_width
image_height = c10_image_height
image_depth = c10_image_depth
num_labels = c10_num_labels
#the datasets
train_dataset = train_dataset_cifar10
train_labels = train_labels_cifar10
test_dataset = test_dataset_cifar10
test_labels = test_labels_cifar10
#number of iterations and learning rate
num_steps = 7001
display_step = 200
learning_rate = 0.5
batch_size=64
graph = tf.Graph()
with graph.as_default():
#1) First we put the input data in a tensorflow friendly form.
tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_width, image_height, image_depth))
tf_train_labels = tf.placeholder(tf.float32, shape = (batch_size, num_labels))
tf_test_dataset = tf.constant(test_dataset, tf.float32)
#2) Then, the weight matrices and bias vectors are initialized
variables = variables_lenet5(image_width = image_width, image_height=image_height, image_depth = image_depth, num_labels = num_labels)
#3. The model used to calculate the logits (predicted labels)
model = model_lenet5
logits = model(tf_train_dataset, variables)
#4. then we compute the softmax cross entropy between the logits and the (actual) labels
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
#5. The optimizer is used to calculate the gradients of the loss function
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Predictions for the training, validation, and test data.
train_prediction = tf.nn.softmax(logits)
test_prediction = tf.nn.softmax(model(tf_test_dataset, variables))
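As a side note, the softmax cross-entropy computed in step 4 can be sketched in plain numpy (a minimal illustration of the math, not the TensorFlow op itself):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between softmax(logits) and one-hot labels --
    a numpy sketch of tf.nn.softmax_cross_entropy_with_logits."""
    z = logits - logits.max(axis=1, keepdims=True)   # stabilize the exponent
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.mean(np.sum(labels * log_probs, axis=1))
```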
Reuse the code for initializing the TensorFlow session
### running the tensorflow session
train=[]
test=[]
display=[]
with tf.Session(graph=graph) as session:
tf.global_variables_initializer().run()
print('Initialized with learning_rate', learning_rate)
for step in range(num_steps):
#Since we are using stochastic gradient descent, we are selecting small batches from the training dataset,
#and training the convolutional neural network each time with a batch.
offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
batch_labels = train_labels[offset:(offset + batch_size), :]
feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
_, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
if step % display_step == 0:
train_accuracy = accuracy(predictions, batch_labels)
train.append(train_accuracy)
test_accuracy = accuracy(test_prediction.eval(), test_labels)
test.append(test_accuracy)
display.append(step)
message = "step {:04d} : loss is {:06.2f} , accuracy on training set {:02.2f} %, accuracy on test set {:02.2f} %".format(step, l, train_accuracy, test_accuracy)
print(message)
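The `accuracy` helper called in the loop above is defined earlier in the notebook; a minimal sketch of the usual implementation, assuming softmax predictions and one-hot labels, is:

```python
import numpy as np

def accuracy(predictions, labels):
    """Percentage of rows where the argmax of the softmax prediction
    matches the argmax of the one-hot label."""
    correct = np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
    return 100.0 * correct / predictions.shape[0]
```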
Plot the graph as done earlier
# Graph to see the network plateau
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure()
plt.plot(display,test,label='validation')
plt.plot(display,train,label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
# plt.xlim([1,display_step])
# plt.xlim(display)
# plt.ylim([0,1])
plt.grid(True)
plt.title("Model Accuracy")
plt.show()
# fig.savefig('img/'+str(i)+'-accuracy.jpg')
plt.close(fig)
Train Accuracy = 45%, Test Accuracy = 42%
Therefore, network initialization (Xavier and random normal) contributes to the network plateauing consistently. This is an important parameter to consider while tuning a CNN model, especially to bring consistency between the training and testing results and to steady the decrease in the losses.
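For reference, a minimal numpy sketch of Xavier (Glorot) uniform initialization is shown below; the filter shape is hypothetical, and the fan-in/fan-out convention used is one common choice:

```python
import numpy as np

def xavier_init(shape, seed=None):
    """Glorot/Xavier uniform initialization:
    limit = sqrt(6 / (fan_in + fan_out))."""
    rng = np.random.default_rng(seed)
    fan_in = int(np.prod(shape[:-1]))   # receptive field x input depth (sketch)
    fan_out = shape[-1]
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)

# Hypothetical conv filter shape: (filter_h, filter_w, in_depth, out_depth)
w1 = xavier_init((5, 5, 3, 16), seed=0)
```

This keeps the variance of the activations roughly constant across layers, which is why it tends to make training plateau more consistently than an arbitrary random normal.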
After performing the various parameter tuning, the following is observed:
The network currently works well with a learning rate of 0.5, though it may make sense to try a combination of the learning rate with an activation function other than sigmoid, as there was no change with the sigmoid activation at this learning rate.
Activation Function : Currently the sigmoid activation function works best and appears to provide the highest accuracy in 7000 epochs.
Loss : The hinge loss with the Tanh activation function worked well and could be replaced by the softmax cross-entropy, but this would be a comparative study to take up along with the number of epochs.
Number of Epochs : As mentioned earlier, the network must be trained for at least 150000 epochs. There is a clear increase in accuracy as the number of epochs increases.
Gradient Estimation : The Gradient Descent optimizer performs the best and contributes considerably to the plateauing of the network.

The loss function (hinge loss), the number of epochs, and the network initialization (Xavier and Gaussian) played the most important roles in improving the training and testing accuracy, and the model can be run for more epochs to see their effects on the CNN. As per the benchmark, the above-mentioned parameters clearly outperformed the others.
Recurrent Neural Networks (RNNs) are a class of neural networks where connections between units form a directed graph along a sequence. RNNs can use their internal state (memory) to process sequences of inputs, and hence they are popularly used for handwriting recognition. An RNN with a gated state or gated memory is part of the Long Short-Term Memory (LSTM) family of neural networks.
LSTMs contain information outside the normal flow of the recurrent network in a gated cell. Information can be stored in, written to, or read from a cell, much like data in a computer’s memory. The cell makes decisions about what to store, and when to allow reads, writes and erasures, via gates that open and close. Unlike the digital storage on computers, however, these gates are analog, implemented with element-wise multiplication by sigmoids, which are all in the range of 0-1.
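The gating described above can be sketched as a single LSTM time step in numpy (a minimal illustration with hypothetical sizes, not the TensorFlow cell used later):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [x; h_prev] onto the four stacked gates."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[:n])        # input gate: what to write
    f = sigmoid(z[n:2*n])     # forget gate: what to erase
    o = sigmoid(z[2*n:3*n])   # output gate: what to read out
    g = np.tanh(z[3*n:])      # candidate cell content
    c = f * c_prev + i * g    # analog write/erase via elementwise products
    h = o * np.tanh(c)        # gated read of the cell state
    return h, c

# Hypothetical sizes: one 32-pixel input row, 4 hidden units
rng = np.random.default_rng(0)
n_in, n_hid = 32, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
```

Because each gate is a sigmoid in (0, 1), the reads, writes, and erasures are analog rather than binary, exactly as described above.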
HASYv2 is a dataset of single handwritten symbols, similar to MNIST, the "Hello World" dataset for handwriting recognition. It contains 168,233 instances of 32px x 32px images across 369 classes.
Since this dataset consists of a large number of images that would require additional computational power and a considerable amount of time to train, I have used a subset of the dataset created by Sumit Kothari. He stored the images and labels of the subset in numpy arrays. The data can be found at the link below.
https://github.com/sumit-kothari/AlphaNum-HASYv2
Steps to use the dataset
[1] Download the dataset from the link below :
https://github.com/sumit-kothari/AlphaNum-HASYv2
[2] Download and store the dataset in the Jupyter Notebook path, where all the other .ipynb notebooks exist. The file path to store it in is as follows:
The Jupyter Notebook path would look as follows :

[3] Files to Download
alphanum-hasy-data-X.npy alphanum-hasy-data-y.npy symbols.csv
[4] This will ensure that we can directly access the datasets
The RNN-LSTM structure uses weights and biases initialized from a random normal distribution. The simplest form of RNN, the static RNN cell with a Basic LSTM cell, is used. In the case of the HASYv2 dataset, the size of an image is 32x32 pixels. The RNN computes through the 32 rows of the image: it receives 32 time steps, where one row (32 pixels) is input at each step, so a full image is consumed in 32 time steps. A batch size for the number of images is supplied, such that every time step receives the chosen batch size of images.
The learning rate selected is 0.001 with a batch size of 128. The Adam optimizer is selected to reduce the softmax cross-entropy loss.
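The row-by-row feeding described above can be sketched in numpy: a batch of 32x32 images becomes 32 time steps, each a (batch_size, 32) slice (the batch size here is hypothetical):

```python
import numpy as np

# Hypothetical batch of 128 images of 32x32 pixels
batch = np.zeros((128, 32, 32))
# One time step per image row: 32 steps, each supplying one row of 32 pixels
# per image -- the numpy analogue of unstacking the tensor along axis 1.
time_steps = [batch[:, t, :] for t in range(32)]
```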
First, let's import all the necessary libraries.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import urllib
import requests
from bs4 import BeautifulSoup
from pandas import DataFrame
import zipfile,io,os
import pandas as pd
from random import randint
from sklearn.model_selection import train_test_split
from tensorflow.contrib import rnn
from keras.utils import np_utils
The file alphanum-hasy-data-X.npy contains the images as a numpy array and alphanum-hasy-data-y.npy contains the image labels as a numpy array. symbols.csv contains the list of symbols with their LaTeX representation; the latex field contains the label of the image. The CSV file also contains the count of training and testing samples for each symbol.
We load each of the files in this step
# Dataset consists of a subset of HASYv2 data. Only the alphanumeric characters are used
# Code referenced from https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data by Sumit Kothari
#which is public in the below 3 cells
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
Each symbol has a LaTeX representation that signifies the character. In order to see which symbol id is associated with each symbol, the following function has been written.
# Code referenced from https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data by Sumit Kothari which is public in the below 3 cells
def symbol_id_to_symbol(symbol_id = None):
#first we check if the symbol id exists, if it does not exist send a message , else provide its latex value to the symbol
if symbol_id:
symbol_data = SYMBOLS.loc[SYMBOLS['symbol_id'] == symbol_id]
if not symbol_data.empty:
return str(symbol_data["latex"].values[0])
else:
print("This should not have happened, wrong symbol_id = ", symbol_id)
return None
else:
print("This should not have happened, no symbol id passed")
return None
# test some values
print("21 = ", symbol_id_to_symbol(21))
print("32 = ", symbol_id_to_symbol(32))
print("90 = ", symbol_id_to_symbol(90))
print("95 = ", symbol_id_to_symbol(95))
In the above cell it can be seen that symbol_id 21 has no LaTeX representation. Symbol_id 32 corresponds to the handwritten character 'B'.
Next, we are going to plot a few of the images, whose indices are generated by the random integer function. Then we use the symbol_id_to_symbol function to map each symbol image to its label.
#plot images from the dataset
# Code referenced from https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data by Sumit Kothari which is public in the below 3 cells
f, ax = plt.subplots(2, 3, figsize=(12, 10))
ax_x = 0
ax_y = 0
# plot 6 random images from the dataset, generated by randint
for i in range(6):
randKey = randint(0, X_load.shape[0] - 1)
ax[ax_x, ax_y].imshow(X_load[randKey], cmap='gray')
ax[ax_x, ax_y].title.set_text("Value : " + symbol_id_to_symbol(y_load[randKey]))
# for proper subplots
if ax_x == 1:
ax_x = 0
ax_y = ax_y + 1
else:
ax_x = ax_x + 1
After visualizing the data, let's split it into training and testing sets using the scikit-learn library.
# Split the data into training and testing
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Shape of Train Dataset")
print(X_train.shape, y_train.shape)
print("Shape of Test Dataset")
print(X_test.shape, y_test.shape)
# the cells below which represent the architecture and running of the network has code that has been referenced
#from https://jasdeep06.github.io/posts/Understanding-LSTM-in-Tensorflow-MNIST/ by author jasdeep06.
#The code below is based on jasdeep06 (no licence mentioned )
# this code has been modified further to adapt to the HasyV2 dataset
Let's define the network parameters. Since the image size is 32x32 pixels, time_steps=32 and n_input=32. The total number of classes in the subset is 116, hence n_classes=116. Next, we preprocess the data, converting the labels to one-hot encoded values.
Next, we initialize the weights and bias for the model, following which we initialize the network parameters and define the placeholders. Follow the steps below to create the network structure in TensorFlow.
# Defining the network parameters
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=128
#rows of 32 pixels
n_input=32
#learning rate for adam
learning_rate=0.001
#hasyv2 has 116 classes.
n_classes=116
#size of batch
batch_size=128
Since the LSTM model takes one-hot encoded values, we normalize the images and one-hot encode the labels.
# Normalize the training set
# Code referenced from https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
#by Sumit Kothari which is public in the below 3 cells
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs which are the labels
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
#Lets check the total number of labels in the dataset
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
It is seen that the total number of labels in the dataset is 116
Weight initialization for the network is done below using a Gaussian distribution. It was observed that if the weights and bias are not initialized, the training for this dataset attains very good accuracies but the testing accuracies are extremely low. Hence, weight initialization plays a very important role for the RNN LSTM model. Next, we define the placeholders, where the input has a shape of 32x32 and the output has a shape of 116.
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
Next, we process the input tensor, which has a shape of (batch_size, n_steps, n_input), into a list along the time_steps axis using the unstack function. This will help in feeding the input as a list to the static RNN cell.
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
Next, we define the Basic LSTM cell and the static RNN cell.
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
We define the prediction as a matrix multiplication of the last output with the weights, plus the bias. The loss function is softmax_cross_entropy_with_logits, and the Adam optimizer is used to reduce the loss. We define the accuracy evaluation as well.
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
We define a function next_batch, which is similar to mnist.train.next_batch() that is commonly used while training on the MNIST data. This function helps iterate through the batches while training. The mnist.train.next_batch() function is specific to TensorFlow and the MNIST dataset; hence, the following function provides a batch of images while iterating through all the images of the HASYv2 dataset.
# The code has been referenced from
#https://stackoverflow.com/questions/40994583/how-to-implement-tensorflows-next-batch-for-own-data?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa
#by author @edo
import numpy as np
def next_batch(num, data, labels):
'''
Return a total of `num` random samples and labels.
'''
idx = np.arange(0 , len(data))
np.random.shuffle(idx)
idx = idx[:num]
data_shuffle = [data[ i] for i in idx]
labels_shuffle = [labels[ i] for i in idx]
return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
Next, we initialize the TensorFlow session and train the model.
We test the model after it has been trained for 800 epochs. We initialize lists named train_loss, train_accuracy, and epoch to store the values in the TensorFlow session, which will subsequently be used to visualize the loss and accuracy vs. epochs. Follow the comments in the code to understand the steps implemented in detail.
# Store the loss, accuracy and epochs in a list to plot the network performance
# Number of epochs is selected as 800
train_loss=[]
train_accuracy=[]
epoch=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
iter=1
while iter<800:
#provide the images in batches
batch_x,batch_y=next_batch(batch_size,X_train,y_train)
#reshape the training data for the tensor
batch_x=batch_x.reshape((batch_size,time_steps,n_input))
# apply the optimizer to the training data
sess.run(opt, feed_dict={x: batch_x, y: batch_y})
# every 10 iterations, print the training accuracy together with the loss and the iteration number
if iter %10==0:
epoch.append(iter)
acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
train_loss.append(los)
train_accuracy.append(acc)
print("For iter ",iter)
print("Accuracy ",acc)
print("Loss ",los)
print("__________________")
iter=iter+1
#print the testing accuracy
test_data = X_test.reshape((-1, time_steps, n_input))
# test_label = mnist.test.labels[:128]
print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
Let's plot the training accuracy vs. the number of epochs and the loss vs. the number of epochs.
# plot train loss vs epoch
plt.figure(figsize=(18, 5))
plt.subplot(1, 2, 1)
plt.title('Train Loss vs Epoch', fontsize=15)
plt.plot(epoch, train_loss, 'r-')
plt.xlabel('Epoch')
plt.ylabel('Train Loss')
# plot train accuracy vs epoch
plt.subplot(1, 2, 2)
plt.title('Train Accuracy vs Epoch', fontsize=15)
plt.plot(epoch, train_accuracy, 'b-')
plt.xlabel('Epoch')
plt.ylabel('Train Accuracy')
plt.show()
We observe that the training accuracy reached 94.53% and the testing accuracy reaches 68.59%. From the above graphs we understand that the loss decreases roughly every 100 epochs and the accuracy increases consistently every 100 epochs. The network performance is overall good for the above parameters. It was also observed that without weight and bias initialization the training accuracy was very good but the testing accuracy was very low. This indicates that network initialization plays an important role in an RNN LSTM.
Training Accuracy = 94.53% Testing Accuracy=68.59%
Let's try to improve the model by tuning various hyperparameters. The following hyperparameters will be tuned: number of epochs, batch size, number of neurons, a combination of learning rate and number of neurons, optimizer, and activation functions.
With the help of the LSTM model we have achieved a training accuracy of 94.53% and a testing accuracy of 68.59%. Next we will tune the following hyperparameters and observe the impact of each of these on the model.
We will tune the model with the number of epochs set to 500, 1000, and 2000. Let's look at the code step by step.
We first reset the TensorFlow graph in order to reset all the variables in the graph.
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
We reuse the code from the initial model to load the data.
# Dataset consists of a subset of HASYv2 data. Only the alphanumeric characters are used
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
We split the data into training and testing sets as done in the initial model.
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Let's create a function plot_loss_epoch, which will plot the loss vs. the number of epochs completed. We also create a similar function to plot the accuracy.
#Functions to plot loss and accuracy vs number of epochs
# plot train loss vs epoch
def plot_loss_epoch():
plt.figure(figsize=(18, 5))
plt.subplot(1, 2, 1)
plt.title('Train Loss vs Epoch', fontsize=15)
#the lists epoch_list and train_loss are initialized in the session
plt.plot(epoch_list, train_loss, 'r-')
plt.xlabel('Epoch')
plt.ylabel('Train Loss')
# plot train accuracy vs epoch
def plot_acc_epoch():
plt.subplot(1, 2, 2)
plt.title('Train Accuracy vs Epoch', fontsize=15)
#the lists epoch_list and train_accuracy are initialized in the session
plt.plot(epoch_list, train_accuracy, 'b-')
plt.xlabel('Epoch')
plt.ylabel('Train Accuracy')
plt.show()
Next, let's initialize all the hyperparameters, the loss, and the optimizer, and start the TensorFlow session. This code has been reused from the previous model. To observe the model's accuracy by number of epochs, closely see the comments in the code below.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=128
#rows of 32 pixels
n_input=32
#learning rate for adam
learning_rate=0.001
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
'''
Return a total of `num` random samples and labels.
'''
idx = np.arange(0 , len(data))
np.random.shuffle(idx)
idx = idx[:num]
data_shuffle = [data[ i] for i in idx]
labels_shuffle = [labels[ i] for i in idx]
return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
# as we want to tune the number of epochs as 500, 1000, and 2000, we have created a list with these values,
#following which there will be a for loop that will iterate through this list
epoch=[500,1000,2000]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
#for loop to iterate through each value selected as the number of epochs
for e in epoch:
train_loss=[]
train_accuracy=[]
epoch_list=[]
iter=1
print("Number Of Epoch:",e)
while iter<e:
batch_x,batch_y=next_batch(batch_size,X_train,y_train)
batch_x=batch_x.reshape((batch_size,time_steps,n_input))
sess.run(opt, feed_dict={x: batch_x, y: batch_y})
if iter %10==0:
epoch_list.append(iter)
acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
train_loss.append(los)
train_accuracy.append(acc)
print("For iter ",iter)
print("Accuracy ",acc)
print("Loss ",los)
print("TOTAL EPOCHS:",e)
print("__________________")
iter=iter+1
test_data = X_test.reshape((-1, time_steps, n_input))
# test_label = mnist.test.labels[:128]
print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
The above output shows the accuracy and loss values for 500, 1000, and 2000 epochs respectively.
It is observed that for 500 epochs, the model attains a training accuracy of 79.68% and a testing accuracy of 64.80%. The overall loss decreases as the number of epochs increases. Additionally, the training accuracy also increases with the number of epochs.
It is observed that for 1000 epochs, the model attains a maximum training accuracy of 99.21%, with its very last iteration providing an accuracy of 93.75%. The testing accuracy reaches a maximum of 70.02%.
We observe that in the case of 2000 epochs, the network almost plateaus. It reaches a training accuracy of 100% but its testing accuracy is 70.81%. We observe occasional spikes in the loss as the number of epochs grows, but the overall trend indicates that the loss decreases with the number of epochs. Additionally, the model attains 100% accuracy within the first 250 iterations, which could be because we have not set the random seed.
Maximum training accuracy = 100%, maximum testing accuracy = 70.81% for number of epochs = 2000.
It is observed that for 2000 epochs the network almost plateaus. Hence, the total number of epochs can definitely be considered an important hyperparameter for tuning the RNN LSTM model.
Let's tune the batch size next and observe the performance of the network. In the previous cases, we used a batch size of 128. We will observe the impact of both reducing and increasing the batch size.
Batch sizes selected 4, 512.
Let's have a look at the code. Most of the code has been reused from the initial model, except the batch_size parameter.
Reset the TensorFlow graph and load the data as in the initial model.
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data. Only the alphanumeric characters are used
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Next, initialize the various hyperparameters. We create a list for the batch size, as we need to assess the performance of the network for batch sizes of 4 and 512. All the code is reused from the previous model except the batch size; closely see the comments for the change.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=128
#rows of 32 pixels
n_input=32
#learning rate for adam
learning_rate=0.001
# labels=116
n_classes=116
#size of batch
# batch_size is assigned a list with the desired batch sizes 4 and 512. We will then iterate through this list in the TensorFlow session
batch_size=[4,512]
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
'''
Return a total of `num` random samples and labels.
'''
idx = np.arange(0 , len(data))
np.random.shuffle(idx)
idx = idx[:num]
data_shuffle = [data[ i] for i in idx]
labels_shuffle = [labels[ i] for i in idx]
return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
# for loop to iterate through the batch sizes. After every 800 epochs it will select the next batch size
for b in batch_size:
train_loss=[]
train_accuracy=[]
epoch_list=[]
iter=1
print("Batch Size:",b)
while iter<800:
batch_x,batch_y=next_batch(b,X_train,y_train)
batch_x=batch_x.reshape((b,time_steps,n_input))
sess.run(opt, feed_dict={x: batch_x, y: batch_y})
if iter %10==0:
epoch_list.append(iter)
acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
train_loss.append(los)
train_accuracy.append(acc)
print("For iter ",iter)
print("Accuracy ",acc)
print("Loss ",los)
print("Batch Size:",b)
print("__________________")
iter=iter+1
test_data = X_test.reshape((-1, time_steps, n_input))
print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
It is clearly observed that a batch size of 4 did not improve the accuracy: the model reached a training accuracy of 75% and a testing accuracy of only 19%.
The batch size of 512 definitely improved the training accuracy by 3% over the benchmark accuracy.
Maximum training accuracy for batch_size = 512 is 97.65% and the testing accuracy is 68.52%.
Hence, increasing the batch size from 128 to 512 definitely improved the accuracy of the model. Clearly, decreasing the batch size did not help, but increasing it to 512 did. Hence, batch size is a prospective hyperparameter to tune for an RNN LSTM model, with a preferable batch size of 512.
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data. Only the alphanumeric characters are used
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Initialize the model with the same hyperparameters as the previous model, changing only num_units. Set num_units=1 (see the comments for the change) and run the TensorFlow session to observe the results.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units. Set num_units=1
num_units=1
#columns of 32 pixels per time step
n_input=32
#learning rate for adam
learning_rate=0.001
#the HASYv2 alphanumeric subset has 116 classes
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    train_loss=[]
    train_accuracy=[]
    epoch_list=[]
    iter=1
    while iter<800:
        batch_x,batch_y=next_batch(batch_size,X_train,y_train)
        batch_x=batch_x.reshape((batch_size,time_steps,n_input))
        sess.run(opt, feed_dict={x: batch_x, y: batch_y})
        if iter % 10 == 0:
            epoch_list.append(iter)
            acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
            los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
            train_loss.append(los)
            train_accuracy.append(acc)
            print("For iter ",iter)
            print("Accuracy ",acc)
            print("Loss ",los)
            print("Number of Neurons:",num_units)
            print("__________________")
        iter=iter+1
    test_data = X_test.reshape((-1, time_steps, n_input))
    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
It is observed that using just one neuron directly affected the accuracy. The training accuracy fluctuates, reaching a high of 80%, while the testing accuracy is only 2.7%. Hence, as the number of neurons decreases, the accuracy also decreases. The occasional spike to 80% training accuracy is a single outlier occurring around the 280th or 290th iteration.
Let's check the accuracy with 512 neurons. The initial code to load the data and reset the graph remains the same as in the previous model.
def reset_graph(seed=2018):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data: only the alphanumeric characters
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Initialize the parameters as in the initial model, editing only num_units to 512. See the comments for the change.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units. Edit num_units to 512
num_units=512
#columns of 32 pixels per time step
n_input=32
#learning rate for adam
learning_rate=0.001
#the HASYv2 alphanumeric subset has 116 classes
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    train_loss=[]
    train_accuracy=[]
    epoch_list=[]
    iter=1
    while iter<800:
        batch_x,batch_y=next_batch(batch_size,X_train,y_train)
        batch_x=batch_x.reshape((batch_size,time_steps,n_input))
        sess.run(opt, feed_dict={x: batch_x, y: batch_y})
        if iter % 10 == 0:
            epoch_list.append(iter)
            acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
            los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
            train_loss.append(los)
            train_accuracy.append(acc)
            print("For iter ",iter)
            print("Accuracy ",acc)
            print("Loss ",los)
            print("Number of Neurons:",num_units)
            print("__________________")
        iter=iter+1
    test_data = X_test.reshape((-1, time_steps, n_input))
    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
It is observed that there is a considerable increase in accuracy with the increase in the number of neurons. Increasing the number of neurons from 128 to 512 raised the accuracy, surpassing that of the very first model. The testing accuracy increased further to 71.81%, a considerable improvement.
Training accuracy = 97.65%, Testing accuracy = 71.81%
In short, reducing the number of neurons hurt the network's performance, while increasing the number of neurons to 512 improved the testing accuracy, which is a good sign.
Hence, the number of neurons can definitely be considered a prospective hyperparameter for tuning an RNN LSTM model; here, 512 neurons worked best.
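One reason num_units matters so much is model capacity. A standard LSTM cell (including TensorFlow's BasicLSTMCell, which packs the four gates into one kernel) has 4 * ((n_input + num_units) * num_units + num_units) trainable parameters, so going from 128 to 512 units grows the recurrent layer by more than a factor of ten. A quick sanity check of that formula:

```python
def lstm_param_count(n_input, num_units):
    # Four gates (input, forget, cell, output), each with a weight matrix
    # over [input; hidden state] plus a bias vector.
    return 4 * ((n_input + num_units) * num_units + num_units)

small = lstm_param_count(32, 128)   # the benchmark model
large = lstm_param_count(32, 512)   # the tuned model
print(small, large)
```

More parameters mean more capacity but also slower training and a higher risk of overfitting, which is why the testing accuracy gain (68.59% to 71.81%) is smaller than the training gain.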
We will now tune the learning rate to the values 0.1 and 0.0001 to see whether it impacts the performance of the network.
Let's start with learning rate = 0.1 and 512 neurons. As before, the code to reset the graph and load the data remains the same as in the initial model.
def reset_graph(seed=2018):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data: only the alphanumeric characters
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Next, initialize the hyperparameters as per the initial model, editing the learning rate to 0.1 and the number of neurons to 512.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=512
#columns of 32 pixels per time step
n_input=32
#learning rate for adam
learning_rate=0.1
#the HASYv2 alphanumeric subset has 116 classes
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    train_loss=[]
    train_accuracy=[]
    epoch_list=[]
    iter=1
    while iter<800:
        batch_x,batch_y=next_batch(batch_size,X_train,y_train)
        batch_x=batch_x.reshape((batch_size,time_steps,n_input))
        sess.run(opt, feed_dict={x: batch_x, y: batch_y})
        if iter % 10 == 0:
            epoch_list.append(iter)
            acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
            los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
            train_loss.append(los)
            train_accuracy.append(acc)
            print("For iter ",iter)
            print("Accuracy ",acc)
            print("Loss ",los)
            print("Learning rate:",learning_rate)
            print("__________________")
        iter=iter+1
    test_data = X_test.reshape((-1, time_steps, n_input))
    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
Increasing the learning rate produced highly variable training accuracy, with spikes where it even touched 100%. But it is not possible to tell from this alone whether increasing the learning rate helped the RNN LSTM model. Let's observe what happens when the learning rate is decreased instead: the algorithm will learn slowly but navigate the loss surface more carefully.
Highest training accuracy = 100% (but unstable), Testing accuracy = 2.6%
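The instability at a learning rate of 0.1 is easy to reproduce on a toy problem: plain gradient descent on f(w) = w² converges smoothly when the step size suits the curvature and diverges once it is too large. This is a sketch with fixed-step gradient descent rather than Adam, so the exact threshold differs from the LSTM above:

```python
def descend(lr, steps=50, w=1.0):
    # Minimise f(w) = w**2 with fixed-step gradient descent (gradient is 2*w).
    for _ in range(steps):
        w = w - lr * 2 * w
    return abs(w)

stable = descend(0.1)     # |w| shrinks by a factor of 0.8 per step
unstable = descend(1.1)   # |w| grows by a factor of 1.2 per step
print(stable, unstable)
```

The same mechanism, scaled up, is what produces the wild accuracy spikes in the 0.1 run.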
Next, we will tune the model with learning rate = 0.0001 and batch size = 128. The code to load the dataset remains the same as in the initial model.
def reset_graph(seed=2018):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data: only the alphanumeric characters
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Reuse the hyperparameter initialization code from the initial model, editing only the learning rate to 0.0001 and num_units to 512.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=512
#columns of 32 pixels per time step
n_input=32
#learning rate for adam
learning_rate=0.0001
#the HASYv2 alphanumeric subset has 116 classes
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    train_loss=[]
    train_accuracy=[]
    epoch_list=[]
    iter=1
    while iter<800:
        batch_x,batch_y=next_batch(batch_size,X_train,y_train)
        batch_x=batch_x.reshape((batch_size,time_steps,n_input))
        sess.run(opt, feed_dict={x: batch_x, y: batch_y})
        if iter % 10 == 0:
            epoch_list.append(iter)
            acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
            los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
            train_loss.append(los)
            train_accuracy.append(acc)
            print("For iter ",iter)
            print("Accuracy ",acc)
            print("Loss ",los)
            print("Learning rate:",learning_rate)
            print("__________________")
        iter=iter+1
    test_data = X_test.reshape((-1, time_steps, n_input))
    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
With the learning rate decreased to 0.0001, the accuracy improves compared with the run at 0.1. However, it does not beat the benchmark training accuracy of 94.53% and testing accuracy of 68.59%. Hence, decreasing the learning rate helped, but not considerably: the network performed best at the learning rate of 0.001 used in the very first model, before any hyperparameter tuning.
Decreasing the learning rate to 0.0001 with 512 neurons gave better accuracy than 0.1 with 512 neurons, but the most desirable result was still a learning rate of 0.001 with 128 neurons.
Hence, although the learning rate affected the performance of the model, this particular combination is a less prospective route to improving the accuracy of the RNN LSTM model. The learning rate by itself, however, can still be considered a prospective parameter to tune.
In this section we will tune the optimizer, trying RMSProp and Adagrad.
Let's start with the Adagrad optimizer. Reuse the code from the initial RNN LSTM model for resetting the graph and loading the data.
def reset_graph(seed=2018):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data: only the alphanumeric characters
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Initialize the hyperparameters as per the initial model, making sure to change the optimizer to Adagrad. See the comments to spot the change.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=512
#rows of 32 pixels
n_input=32
#learning rate for Adagrad
learning_rate=0.0001
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
## Change optimizer to Adagrad
opt=tf.train.AdagradOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    train_loss=[]
    train_accuracy=[]
    epoch_list=[]
    iter=1
    while iter<800:
        batch_x,batch_y=next_batch(batch_size,X_train,y_train)
        batch_x=batch_x.reshape((batch_size,time_steps,n_input))
        sess.run(opt, feed_dict={x: batch_x, y: batch_y})
        if iter % 10 == 0:
            epoch_list.append(iter)
            acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
            los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
            train_loss.append(los)
            train_accuracy.append(acc)
            print("For iter ",iter)
            print("Accuracy ",acc)
            print("Loss ",los)
            print("Optimizer: Adagrad")
            print("__________________")
        iter=iter+1
    test_data = X_test.reshape((-1, time_steps, n_input))
    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
The model used before hyperparameter tuning reached a training accuracy of 94.53% and a testing accuracy of 68.59% with the Adam optimizer. Using the Adagrad optimizer, the training accuracy never exceeded 10% and the testing accuracy was consistently poor at 7%. Hence, the choice of optimizer plays an important role in the RNN model, and selecting the right optimizer is significant.
Training accuracy = 10%, Testing accuracy = 7%
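Part of why Adagrad stalls here lies in its update rule: it divides each step by the square root of the sum of all past squared gradients, so the effective learning rate only ever shrinks, and with a small base rate like 0.0001 the steps become tiny very quickly. A minimal sketch of the textbook Adagrad rule (not TensorFlow's internals):

```python
import math

def adagrad_steps(grads, lr=0.0001, eps=1e-8):
    """Effective step size Adagrad takes for each successive gradient."""
    accum = 0.0
    steps = []
    for g in grads:
        accum += g * g  # sum of squared gradients; never decays
        steps.append(lr * g / (math.sqrt(accum) + eps))
    return steps

steps = adagrad_steps([1.0] * 100)
print(steps[0], steps[-1])  # the 100th step is about 10x smaller than the first
```

With an already-small base rate, this built-in decay leaves almost no signal to train on, which matches the 10% training accuracy observed above.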
The optimizer used in this case is the RMSProp optimizer. Reuse the code from the initial model to load the data and reset the graph.
def reset_graph(seed=2018):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data: only the alphanumeric characters
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Reuse the code to set the hyperparameter values, replacing the optimizer with RMSProp. See the comments to spot the change.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=128
#rows of 32 pixels
n_input=32
#learning rate for RMSProp
learning_rate=0.0001
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
# change the optimization to RMSProp here
opt=tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    train_loss=[]
    train_accuracy=[]
    epoch_list=[]
    iter=1
    while iter<800:
        batch_x,batch_y=next_batch(batch_size,X_train,y_train)
        batch_x=batch_x.reshape((batch_size,time_steps,n_input))
        sess.run(opt, feed_dict={x: batch_x, y: batch_y})
        if iter % 10 == 0:
            epoch_list.append(iter)
            acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
            los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
            train_loss.append(los)
            train_accuracy.append(acc)
            print("For iter ",iter)
            print("Accuracy ",acc)
            print("Loss ",los)
            print("Optimizer: RMSProp")
            print("__________________")
        iter=iter+1
    test_data = X_test.reshape((-1, time_steps, n_input))
    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
Using the RMSProp optimizer also fell short of the benchmark, though it did reach a training accuracy of 50%.
Training accuracy = 50%, Testing accuracy = 34.62%
Hence, RMSProp helped somewhat, but across the three trials the Adam optimizer clearly outperformed both Adagrad and RMSProp.
The optimizer is a prospective hyperparameter to tune for the RNN LSTM model, though here the Adam optimizer outperformed all the others.
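RMSProp's better showing fits its update rule: instead of summing all past squared gradients the way Adagrad does, it keeps an exponential moving average, so the denominator levels off and the effective step size can stay useful. A small sketch contrasting the two accumulators (textbook formulas; the decay rate of 0.9 is an assumed typical value):

```python
def accumulators(grads, decay=0.9):
    adagrad, rmsprop = 0.0, 0.0
    for g in grads:
        adagrad += g * g                                  # monotonically growing sum
        rmsprop = decay * rmsprop + (1 - decay) * g * g   # exponential moving average
    return adagrad, rmsprop

ada, rms = accumulators([1.0] * 100)
print(ada, rms)  # Adagrad's denominator keeps growing; RMSProp's levels off near 1
```

Adam adds momentum on top of the same moving-average idea, which is consistent with it performing best of the three here.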
def reset_graph(seed=2018):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data: only the alphanumeric characters
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
Reuse the hyperparameter initialization code from the previous model, adding an activation argument to the BasicLSTMCell. Observe the comments closely to see the change made.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=128
#rows of 32 pixels
n_input=32
#learning rate for adam
learning_rate=0.001
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
# use the softsign activation; the default is tanh
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1,activation=tf.nn.softsign)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    train_loss=[]
    train_accuracy=[]
    epoch_list=[]
    iter=1
    while iter<800:
        batch_x,batch_y=next_batch(batch_size,X_train,y_train)
        batch_x=batch_x.reshape((batch_size,time_steps,n_input))
        sess.run(opt, feed_dict={x: batch_x, y: batch_y})
        if iter % 10 == 0:
            epoch_list.append(iter)
            acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
            los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
            train_loss.append(los)
            train_accuracy.append(acc)
            print("For iter ",iter)
            print("Accuracy ",acc)
            print("Loss ",los)
            print("Activation: Softsign")
            print("__________________")
        iter=iter+1
    test_data = X_test.reshape((-1, time_steps, n_input))
    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
The default activation when none is specified is tanh, which is what the first model used. The softsign activation did not beat the existing training accuracy of 94.53%, giving 92% instead, but it produced a minute increase in testing accuracy, from 68.59% to 68.8%. It would be worth observing this change over more epochs as well.
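Softsign, defined as x / (1 + |x|), behaves much like tanh but saturates polynomially rather than exponentially, leaving somewhat larger gradients for large inputs. A quick numerical comparison using plain numpy re-implementations (not TensorFlow's ops):

```python
import numpy as np

def softsign(x):
    return x / (1 + np.abs(x))

xs = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(np.round(softsign(xs), 4))   # approaches +/-1 slowly
print(np.round(np.tanh(xs), 4))    # saturates much faster
```

The slower saturation is why softsign is sometimes tried as a drop-in tanh replacement in recurrent cells, though as seen here the effect on accuracy can be marginal.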
Next we will tune the activation function to ReLU. Loading the data remains the same as in the previous model.
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
# Dataset consists of a subset of HASYv2 data: only the alphanumeric characters
# Code referenced -https://www.kaggle.com/usersumit/alphanumeric-handwritten-dataset/data
# The data subset has been drawn from https://github.com/sumit-kothari/AlphaNum-HASYv2/tree/master/output_data_alpha_num
X_FNAME = "alphanum-hasy-data-X.npy"
Y_FNAME = "alphanum-hasy-data-y.npy"
SYMBOL_FNAME = "symbols.csv"
X_load = np.load(X_FNAME)
y_load = np.load(Y_FNAME)
SYMBOLS = pd.read_csv(SYMBOL_FNAME)
SYMBOLS = SYMBOLS[["symbol_id", "latex"]]
#This is using the Scikit Learn Library
X_train, X_test, y_train, y_test = train_test_split(X_load, y_load, test_size=0.3)
print("Train dataset shape")
print(X_train.shape, y_train.shape)
print("Test dataset shape")
print(X_test.shape, y_test.shape)
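train_test_split shuffles the data and partitions it; the same 70/30 split can be sketched manually in plain NumPy (an illustration with toy arrays and a fixed seed, not the scikit-learn implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(100).reshape(50, 2)   # 50 toy samples, 2 features each
y = np.arange(50)

idx = rng.permutation(len(X))       # shuffle the sample indices
n_test = int(len(X) * 0.3)          # hold out 30%, as in test_size=0.3
test_idx, train_idx = idx[:n_test], idx[n_test:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
print(X_train.shape, X_test.shape)  # (35, 2) (15, 2)
```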
Initialize the hyperparameters as in the initial model code, but make sure to change the activation function. Look closely at the comments to see where the activation changes.
reset_graph()
#define constants
#unrolled through 32 time steps
time_steps=32
#hidden LSTM units
num_units=128
#rows of 32 pixels
n_input=32
#learning rate for adam
learning_rate=0.001
#the HASYv2 alphanumeric subset has 116 classes
n_classes=116
#size of batch
batch_size=128
# Normalize the training set
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
print("num_classes = ", num_classes)
out_weights=tf.Variable(tf.random_normal([num_units,n_classes]))
out_bias=tf.Variable(tf.random_normal([n_classes]))
#defining placeholders
#input image placeholder
x=tf.placeholder("float",[None,time_steps,n_input])
#input label placeholder
y=tf.placeholder("float",[None,n_classes])
#processing the input tensor from [batch_size,n_steps,n_input] to "time_steps" number of [batch_size,n_input] tensors
input=tf.unstack(x ,time_steps,1)
#defining the network
#change the activation function to relu
lstm_layer=rnn.BasicLSTMCell(num_units,forget_bias=1,activation=tf.nn.relu)
outputs,_=rnn.static_rnn(lstm_layer,input,dtype="float32")
#converting last output of dimension [batch_size,num_units] to [batch_size,n_classes] by out_weight multiplication
prediction=tf.matmul(outputs[-1],out_weights)+out_bias
#loss_function
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
import numpy as np
def next_batch(num, data, labels):
'''
Return a total of `num` random samples and labels.
'''
idx = np.arange(0 , len(data))
np.random.shuffle(idx)
idx = idx[:num]
data_shuffle = [data[ i] for i in idx]
labels_shuffle = [labels[ i] for i in idx]
return np.asarray(data_shuffle), np.asarray(labels_shuffle)
# Xtr, Ytr = np.arange(0, 10), np.arange(0, 100).reshape(10, 10)
# print(Xtr)
# print(Ytr)
train_loss=[]
train_accuracy=[]
epoch_list=[]
#initialize variables
init=tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
train_loss=[]
train_accuracy=[]
epoch_list=[]
iter=1
while iter<800:
batch_x,batch_y=next_batch(batch_size,X_train,y_train)
batch_x=batch_x.reshape((batch_size,time_steps,n_input))
sess.run(opt, feed_dict={x: batch_x, y: batch_y})
if iter %10==0:
epoch_list.append(iter)
acc=sess.run(accuracy,feed_dict={x:batch_x,y:batch_y})
los=sess.run(loss,feed_dict={x:batch_x,y:batch_y})
train_loss.append(los)
train_accuracy.append(acc)
print("For iter ",iter)
print("Accuracy ",acc)
print("Loss ",los)
print("Activation: Relu")
print("__________________")
iter=iter+1
test_data = X_test.reshape((-1, time_steps, n_input))
print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: y_test}))
plot_loss_epoch()
plot_acc_epoch()
Using the ReLU activation function did not help: the accuracy decreased when the activation was changed to ReLU. Still, it is worth trying different activation functions, as they clearly affect the accuracy and performance of the network.
Training Accuracy - 89%, Testing Accuracy - 67.7%
It is observed that Tanh performed the best. It would also be interesting to note the change in performance over a larger number of epochs, and with the Adagrad optimizer.
Hence, the activation function must definitely be a prospective hyperparameter for tuning the RNN LSTM model, with Tanh the preferred value.
The parameters to tune for an RNN are as follows:
The prospective parameters to tune for an RNN LSTM model are the number of epochs, learning rate, batch size, number of neurons, optimizer, and activation function. The combination of learning rate and number of neurons did not improve the performance of the model, so those can be tuned optionally.
Summary

A restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs. RBMs are widely used in deep belief networks, and are popular for classification problems as well.
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image; each image has dimensions of 28x28 pixels. The dataset consists of pairs of a handwritten digit image and a label. The digits range from 0 to 9, so there are 10 patterns in total.
handwritten digit image: a grayscale image of 28 x 28 pixels. label: the actual digit, from 0 to 9, that the handwritten image represents.
This is a popular dataset in data science, often used as a "Hello World" dataset.
We will use the MNIST dataset from the TensorFlow library; it is downloaded automatically by the TensorFlow code, so there are no prerequisites.

pip install xrbm
You can also refer to the below website for more details
The number of visible units of the RBM equals the dimensionality of the training data, so 784 visible units are used, the total dimension of the input data. The number of hidden units is 200, the batch size is 100, and the number of epochs is 100 for the first trial. Training uses contrastive divergence with k Gibbs samples (CD-k).
Let's walk through the TensorFlow code for the RBM in steps. First we import the necessary libraries, after which the data is downloaded automatically via TensorFlow. We then initialize the hyperparameters, define the placeholders, weights, and biases, and apply contrastive divergence with Gibbs sampling. Let's see how the code looks below.
import numpy as np
import tensorflow as tf
%matplotlib inline
import matplotlib.pyplot as plt
from IPython import display
#Uncomment the below lines if you didn't install xRBM using pip and want to use the local code instead
#import sys
#sys.path.append('../')
import xrbm.models
import xrbm.train
import xrbm.losses
from xrbm.utils.vizutils import *
Downloading the data via the TensorFlow library and extracting the files:
from tensorflow.examples.tutorials.mnist import input_data
data_sets = input_data.read_data_sets('MNIST_data', False)
training_data = data_sets.train.images
As mentioned above, num_vis is the number of visible units, which equals the total dimensionality of the image; in this case that is 784. We also initialize the number of epochs, the number of hidden units, the learning rate, and the batch size.
num_vis = training_data[0].shape[0] #=784
num_hid = 200
learning_rate = 0.1
batch_size = 100
training_epochs = 100
We reset the graph in order to initialize all variables, then apply the hyperparameters to the RBM model.
# Let's reset the tensorflow graph in case we want to rerun the code
tf.reset_default_graph()
# apply the parameters to the RBM Model. The syntax is specific to the xRBM Library
rbm = xrbm.models.RBM(num_vis=num_vis, num_hid=num_hid, name='rbm_mnist')
Next we create mini batches
batch_idxs = np.random.permutation(range(len(training_data)))
n_batches = len(batch_idxs) // batch_size
We create a placeholder for the mini-batch data during training.
We use the CD-k algorithm for training the RBM. For this, we create an instance of the CDApproximator from the xrbm.train module and pass the learning rate to it.
We then define our training op using the CDApproximator's train method, passing the RBM model and the placeholder for the data.
In order to monitor the training process, we calculate the reconstruction cost of the model at each epoch, using the rec_cost_op.
The CD-k algorithm is contrastive divergence with k steps, used as an approximation to the gradient; it starts a short Markov chain from every input.
batch_data = tf.placeholder(tf.float32, shape=(None, num_vis))
cdapproximator = xrbm.train.CDApproximator(learning_rate=learning_rate)
train_op = cdapproximator.train(rbm, vis_data=batch_data)
reconstructed_data,_,_,_ = rbm.gibbs_sample_vhv(batch_data)
xentropy_rec_cost = xrbm.losses.cross_entropy(batch_data, reconstructed_data)
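Under the hood, a single CD-1 update can be sketched in plain NumPy for a tiny binary RBM. This is an illustrative sketch only; the real updates here are handled by xrbm's CDApproximator, and the sizes below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# tiny RBM: 6 visible units, 3 hidden units (hypothetical sizes)
W = rng.normal(0, 0.1, size=(6, 3))
b_v = np.zeros(6)
b_h = np.zeros(3)

v0 = rng.integers(0, 2, size=(4, 6)).astype(float)  # a mini-batch of binary data

# positive phase: hidden probabilities given the data
h0 = sigmoid(v0 @ W + b_h)
# one Gibbs step: sample hidden states, reconstruct visible, recompute hidden
h_sample = (rng.random(h0.shape) < h0).astype(float)
v1 = sigmoid(h_sample @ W.T + b_v)
h1 = sigmoid(v1 @ W + b_h)

# CD-1 gradient estimate: data correlations minus reconstruction correlations
lr = 0.1
W += lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
print(W.shape)  # weights keep shape (6, 3)
```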
# Create figure first so that we use the same one to draw the filters on during the training
# initialize lists to store the reconstruction cost and epochs
recon_cost=[]
epoch_list=[]
fig = plt.figure(figsize=(12,8))
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
for batch_i in range(n_batches):
# Get just minibatch amount of data
idxs_i = batch_idxs[batch_i * batch_size:(batch_i + 1) * batch_size]
# Run the training step
sess.run(train_op, feed_dict={batch_data: training_data[idxs_i]})
# compute the reconstruction cost
reconstruction_cost = sess.run(xentropy_rec_cost, feed_dict={batch_data: training_data})
#Print the reconstruction cost
title = ('Epoch %i / %i | Reconstruction Cost = %f'%(epoch, training_epochs, reconstruction_cost))
print(title)
#plot the final image
W = rbm.W.eval().transpose()
filters_grid = create_2d_filters_grid(W, filter_shape=(28,28), grid_size=(10, 20), grid_gap=(1,1))
plt.title(title)
plt.imshow(filters_grid, cmap='gray')
display.clear_output(wait=True)
display.display(fig)
This was run for 100 epochs. Generating the images and running the algorithm takes extremely long, even without additional computation. The reconstruction cost after 100 epochs is -524.17, which is high, indicating the model must be trained for longer to reduce the reconstruction cost.
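The cross-entropy reconstruction cost compares each pixel of the input with its reconstruction. A NumPy sketch with hypothetical toy values (not the xrbm implementation; note the costs reported in this post are negative, suggesting xrbm uses the negated form, while this sketch uses the conventional positive cost):

```python
import numpy as np

def cross_entropy_cost(data, recon, eps=1e-7):
    # mean over samples of the summed pixel-wise cross-entropy
    recon = np.clip(recon, eps, 1 - eps)
    return -np.mean(np.sum(data * np.log(recon)
                           + (1 - data) * np.log(1 - recon), axis=1))

data  = np.array([[1.0, 0.0, 1.0]])   # hypothetical binary pixels
recon = np.array([[0.9, 0.1, 0.8]])   # hypothetical reconstruction
print(round(cross_entropy_cost(data, recon), 4))  # 0.4339
```

A perfect reconstruction drives the cost toward zero, which is why a large magnitude after 100 epochs suggests training longer.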
Let us try and see if we can evaluate hyper parameters that can reduce the reconstruction cost.
We will tune the RBM's learning rate, number of hidden units, network initialization, regularization, and activation.
Since training the RBM takes extremely long, we are going to train it for only 50 epochs. Let's start with learning rate values of 0.01 and 0.00001.
We will select a learning rate of 0.01 first. The code for initializing the hyperparameter values and running the TensorFlow session remains the same, as depicted below. First, let's create a function to plot the reconstruction cost against the epochs.
# Plot the reconstruction cost vs epochs
def plot_loss_epoch():
plt.figure(figsize=(18, 5))
plt.subplot(1, 2, 1)
plt.title('Cost vs Epoch', fontsize=15)
plt.plot(epoch_list, cost, 'r-')
plt.xlabel('Epoch')
plt.ylabel('Reconstruction Cost')
Reuse the code from the initial model to initialize the hyperparameter values, create the mini-batches, and run the TensorFlow session. The only edit is the learning rate, which must be changed to 0.01; see the comments closely for details of the change.
# reset graph function
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
tf.reset_default_graph()
cost=[]
epoch_list=[]
# change learning rate to 0.01
num_vis = training_data[0].shape[0] #=784
num_hid = 200
learning_rate = 0.01
batch_size = 100
training_epochs = 50
rbm = xrbm.models.RBM(num_vis=num_vis, num_hid=num_hid, name='rbm_mnist')
batch_idxs = np.random.permutation(range(len(training_data)))
n_batches = len(batch_idxs) // batch_size
batch_data = tf.placeholder(tf.float32, shape=(None, num_vis))
cdapproximator = xrbm.train.CDApproximator(learning_rate=learning_rate)
train_op = cdapproximator.train(rbm, vis_data=batch_data)
reconstructed_data,_,_,_ = rbm.gibbs_sample_vhv(batch_data)
xentropy_rec_cost = xrbm.losses.cross_entropy(batch_data, reconstructed_data)
# Create figure first so that we use the same one to draw the filters on during the training
fig = plt.figure(figsize=(12,8))
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
for batch_i in range(n_batches):
# Get just minibatch amount of data
idxs_i = batch_idxs[batch_i * batch_size:(batch_i + 1) * batch_size]
# Run the training step
sess.run(train_op, feed_dict={batch_data: training_data[idxs_i]})
reconstruction_cost = sess.run(xentropy_rec_cost, feed_dict={batch_data: training_data})
W = rbm.W.eval().transpose()
filters_grid = create_2d_filters_grid(W, filter_shape=(28,28), grid_size=(10, 20), grid_gap=(1,1))
title = ('Epoch %i / %i | Reconstruction Cost = %f'%
(epoch, training_epochs, reconstruction_cost))
cost.append(reconstruction_cost)
epoch_list.append(epoch)
print(title)
plt.title(title)
plt.imshow(filters_grid, cmap='gray')
display.clear_output(wait=True)
display.display(fig)
plot_loss_epoch()
A learning rate of 0.01 was effective, judging by the improved clarity of the filter images compared to a learning rate of 0.1. Hence, the learning rate plays an important role in improving the performance of the RBM. The reconstruction cost is still high, but consider that this was trained for only 50 epochs.
Reconstruction Cost = -528.190
Let's run the model with a learning rate of 0.00001. The code to load the data, initialize the hyperparameters, and run the TensorFlow session remains the same; the only change is that the learning rate is set to 0.00001. Observe the change in the comments.
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
tf.reset_default_graph()
cost=[]
epoch_list=[]
# set learning rate to 0.00001
num_vis = training_data[0].shape[0] #=784
num_hid = 200
learning_rate = 0.00001
batch_size = 100
training_epochs = 50
rbm = xrbm.models.RBM(num_vis=num_vis, num_hid=num_hid, name='rbm_mnist')
# create mini batches
batch_idxs = np.random.permutation(range(len(training_data)))
n_batches = len(batch_idxs) // batch_size
# create placeholder
batch_data = tf.placeholder(tf.float32, shape=(None, num_vis))
cdapproximator = xrbm.train.CDApproximator(learning_rate=learning_rate)
train_op = cdapproximator.train(rbm, vis_data=batch_data)
reconstructed_data,_,_,_ = rbm.gibbs_sample_vhv(batch_data)
xentropy_rec_cost = xrbm.losses.cross_entropy(batch_data, reconstructed_data)
# Create figure first so that we use the same one to draw the filters on during the training
fig = plt.figure(figsize=(12,8))
#run the tensorflow session
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
for batch_i in range(n_batches):
# Get just minibatch amount of data
idxs_i = batch_idxs[batch_i * batch_size:(batch_i + 1) * batch_size]
# Run the training step
sess.run(train_op, feed_dict={batch_data: training_data[idxs_i]})
reconstruction_cost = sess.run(xentropy_rec_cost, feed_dict={batch_data: training_data})
W = rbm.W.eval().transpose()
filters_grid = create_2d_filters_grid(W, filter_shape=(28,28), grid_size=(10, 20), grid_gap=(1,1))
title = ('Epoch %i / %i | Reconstruction Cost = %f'%
(epoch, training_epochs, reconstruction_cost))
print(title)
cost.append(reconstruction_cost)
epoch_list.append(epoch)
plt.title(title)
plt.imshow(filters_grid, cmap='gray')
display.clear_output(wait=True)
display.display(fig)
plot_loss_epoch()
The learning rate of 0.00001 increased the reconstruction cost and reduced the clarity of the image as well.
Reconstruction Cost = -600.41
The learning rate definitely has an effect on the model, and per these trials the optimal values are 0.01 and 0.1. Hence, this is a prospective parameter to tune for the RBM.
Next, let's tune the number of hidden units of the RBM; the values selected are 500 and 100.
Let's start with 500 hidden units. The initial model code remains the same; the only changes are num_hid = 500 and the grid size, which changes to (10, 50). Watch the comments to see the changes.
# reset the graph
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
tf.reset_default_graph()
cost=[]
epoch_list=[]
# change num_hid to 500
num_vis = training_data[0].shape[0] #=784
num_hid = 500
learning_rate = 0.1
batch_size = 100
training_epochs = 50
# set the model
rbm = xrbm.models.RBM(num_vis=num_vis, num_hid=num_hid, name='rbm_mnist')
batch_idxs = np.random.permutation(range(len(training_data)))
n_batches = len(batch_idxs) // batch_size
#create the placeholder
batch_data = tf.placeholder(tf.float32, shape=(None, num_vis))
cdapproximator = xrbm.train.CDApproximator(learning_rate=learning_rate)
train_op = cdapproximator.train(rbm, vis_data=batch_data)
reconstructed_data,_,_,_ = rbm.gibbs_sample_vhv(batch_data)
xentropy_rec_cost = xrbm.losses.cross_entropy(batch_data, reconstructed_data)
# Create figure first so that we use the same one to draw the filters on during the training
fig = plt.figure(figsize=(12,8))
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
for batch_i in range(n_batches):
# Get just minibatch amount of data
idxs_i = batch_idxs[batch_i * batch_size:(batch_i + 1) * batch_size]
# Run the training step
sess.run(train_op, feed_dict={batch_data: training_data[idxs_i]})
reconstruction_cost = sess.run(xentropy_rec_cost, feed_dict={batch_data: training_data})
W = rbm.W.eval().transpose()
# change the grid size dimension to (10,50)
filters_grid = create_2d_filters_grid(W, filter_shape=(28,28), grid_size=(10, 50), grid_gap=(1,1))
title = ('Epoch %i / %i | Reconstruction Cost = %f'%
(epoch, training_epochs, reconstruction_cost))
cost.append(reconstruction_cost)
epoch_list.append(epoch)
print(title)
plt.title(title)
plt.imshow(filters_grid, cmap='gray')
display.clear_output(wait=True)
display.display(fig)
plot_loss_epoch()
We observe that the filter images became extremely clear compared to the benchmark of the first model. Moreover, the reconstruction cost also dropped drastically. This implies that the number of hidden units plays an important role in tuning.
Let's set the number of hidden units to 100 and observe the change in performance. We will reuse the code of the initial model and just change the number of hidden units to 100; see the comments closely for the changes.
# reset graph
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
tf.reset_default_graph()
cost=[]
epoch_list=[]
# set the number of hidden units to 100
num_vis = training_data[0].shape[0] #=784
num_hid = 100
learning_rate = 0.1
batch_size = 100
training_epochs = 50
# apply hyper parameters to the model
rbm = xrbm.models.RBM(num_vis=num_vis, num_hid=num_hid, name='rbm_mnist')
batch_idxs = np.random.permutation(range(len(training_data)))
n_batches = len(batch_idxs) // batch_size
# placeholder introduced
batch_data = tf.placeholder(tf.float32, shape=(None, num_vis))
cdapproximator = xrbm.train.CDApproximator(learning_rate=learning_rate)
train_op = cdapproximator.train(rbm, vis_data=batch_data)
reconstructed_data,_,_,_ = rbm.gibbs_sample_vhv(batch_data)
xentropy_rec_cost = xrbm.losses.cross_entropy(batch_data, reconstructed_data)
# Create figure first so that we use the same one to draw the filters on during the training
fig = plt.figure(figsize=(12,8))
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
for batch_i in range(n_batches):
# Get just minibatch amount of data
idxs_i = batch_idxs[batch_i * batch_size:(batch_i + 1) * batch_size]
# Run the training step
sess.run(train_op, feed_dict={batch_data: training_data[idxs_i]})
reconstruction_cost = sess.run(xentropy_rec_cost, feed_dict={batch_data: training_data})
W = rbm.W.eval().transpose()
#Change the grid size to (10,10)
filters_grid = create_2d_filters_grid(W, filter_shape=(28,28), grid_size=(10, 10), grid_gap=(1,1))
title = ('Epoch %i / %i | Reconstruction Cost = %f'%
(epoch, training_epochs, reconstruction_cost))
cost.append(reconstruction_cost)
epoch_list.append(epoch)
print(title)
plt.title(title)
plt.imshow(filters_grid, cmap='gray')
display.clear_output(wait=True)
display.display(fig)
plot_loss_epoch()
We observe that the cost increased when the number of hidden units decreased, and the filter images are also less clear. Hence, 500 hidden units was the most effective setting.
It is clearly visible that the reconstruction cost decreased as the number of hidden units increased. Hence, the number of hidden units is a prospective parameter to tune for an RBM, with larger values performing better.
Now we will tune the network initialization. We will set it to Xavier initialization and check whether the reconstruction cost decreases or the filter images get clearer. The code to load the data, initialize the hyperparameters, and run the TensorFlow session remains the same as the initial model; the only change is that we add the Xavier initializer as a parameter to the model. Observe the comments below to see the change.
# reset graph
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
tf.reset_default_graph()
# lists to store the epochs and cost for plotting
cost=[]
epoch_list=[]
num_vis = training_data[0].shape[0] #=784
num_hid = 100
learning_rate = 0.1
batch_size = 100
training_epochs = 50
# Here, set the initializer to Xavier initialization as a parameter to the model
rbm = xrbm.models.RBM(num_vis=num_vis, num_hid=num_hid, initializer=tf.contrib.layers.xavier_initializer(),name='rbm_mnist')
batch_idxs = np.random.permutation(range(len(training_data)))
n_batches = len(batch_idxs) // batch_size
batch_data = tf.placeholder(tf.float32, shape=(None, num_vis))
cdapproximator = xrbm.train.CDApproximator(learning_rate=learning_rate)
train_op = cdapproximator.train(rbm, vis_data=batch_data)
reconstructed_data,_,_,_ = rbm.gibbs_sample_vhv(batch_data)
xentropy_rec_cost = xrbm.losses.cross_entropy(batch_data, reconstructed_data)
# Create figure first so that we use the same one to draw the filters on during the training
fig = plt.figure(figsize=(12,8))
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
for batch_i in range(n_batches):
# Get just minibatch amount of data
idxs_i = batch_idxs[batch_i * batch_size:(batch_i + 1) * batch_size]
# Run the training step
sess.run(train_op, feed_dict={batch_data: training_data[idxs_i]})
reconstruction_cost = sess.run(xentropy_rec_cost, feed_dict={batch_data: training_data})
W = rbm.W.eval().transpose()
filters_grid = create_2d_filters_grid(W, filter_shape=(28,28), grid_size=(10, 10), grid_gap=(1,1))
title = ('Epoch %i / %i | Reconstruction Cost = %f'%
(epoch, training_epochs, reconstruction_cost))
cost.append(reconstruction_cost)
epoch_list.append(epoch)
print(title)
# plot the figure
plt.title(title)
plt.imshow(filters_grid, cmap='gray')
display.clear_output(wait=True)
display.display(fig)
plot_loss_epoch()
It is observed that the reconstruction cost ends up higher, but the cost also decreases consistently for this model. Hence, we can still consider network initialization a prospective parameter to tune, even though it did not give very good results here.
Reconstruction Cost = -527.42
Even though the network initialization did not reduce the reconstruction cost, the graph shows the cost decreasing consistently with the number of epochs, so it would be interesting to note whether it has an impact as the number of epochs increases. I will therefore conclude that network initialization is a prospective parameter to tune for the model.
Let's observe the impact of regularization on the RBM network. We will also set the activation to sigmoid and retain the Xavier initialization. The code remains the same as the initial model; the only change is adding the regularizer and the activation to the model, as done below. Look at the comments closely to observe the changes.
#reset graph
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
tf.reset_default_graph()
cost=[]
epoch_list=[]
#set hyper parameters
num_vis = training_data[0].shape[0] #=784
num_hid = 100
learning_rate = 0.1
batch_size = 100
training_epochs = 50
#define model
#set the initializer as xavier initializer and activation as sigmoid
rbm = xrbm.models.RBM(num_vis=num_vis, num_hid=num_hid, initializer=tf.contrib.layers.xavier_initializer(),activation=tf.nn.sigmoid,name='rbm_mnist')
batch_idxs = np.random.permutation(range(len(training_data)))
n_batches = len(batch_idxs) // batch_size
batch_data = tf.placeholder(tf.float32, shape=(None, num_vis))
# set the regularizer here as L1
cdapproximator = xrbm.train.CDApproximator(learning_rate=learning_rate,regularizer=tf.contrib.layers.l1_regularizer(0.001))
train_op = cdapproximator.train(rbm, vis_data=batch_data)
reconstructed_data,_,_,_ = rbm.gibbs_sample_vhv(batch_data)
xentropy_rec_cost = xrbm.losses.cross_entropy(batch_data, reconstructed_data)
# Create figure first so that we use the same one to draw the filters on during the training
fig = plt.figure(figsize=(12,8))
# run the tensorflow session
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
for batch_i in range(n_batches):
# Get just minibatch amount of data
idxs_i = batch_idxs[batch_i * batch_size:(batch_i + 1) * batch_size]
# Run the training step
sess.run(train_op, feed_dict={batch_data: training_data[idxs_i]})
reconstruction_cost = sess.run(xentropy_rec_cost, feed_dict={batch_data: training_data})
W = rbm.W.eval().transpose()
filters_grid = create_2d_filters_grid(W, filter_shape=(28,28), grid_size=(10, 10), grid_gap=(1,1))
title = ('Epoch %i / %i | Reconstruction Cost = %f'%
(epoch, training_epochs, reconstruction_cost))
cost.append(reconstruction_cost)
epoch_list.append(epoch)
print(title)
#plot the graphs
plt.title(title)
plt.imshow(filters_grid, cmap='gray')
display.clear_output(wait=True)
display.display(fig)
plot_loss_epoch()
The filter image is extremely clear, showing that regularization does play an important role for an RBM. Though the reconstruction cost is high, there is a steady decrease in the cost over the epochs. The combination of Xavier initialization and the sigmoid activation is also contributing to the clarity of the image.
As future scope, it would be interesting to note the individual effect of each of these parameters on the RBM network.
Reconstruction Cost = -528.27
The reconstruction cost is very high, but the image is also clear. Hence, it would be important to know the individual effect of each parameter, such as the activation, on the RBM. Based on the current result, I will consider regularization a prospective parameter to tune for the RBM.
As depicted earlier, learning rates of 0.1 and 0.01 were ideal for the performance of the RBM; decreasing the learning rate further degraded the model as well. Hence, the learning rate is a prospective parameter.
Increasing the number of hidden units decreased the reconstruction cost drastically, so 500 hidden units is a prospective candidate.
Network initialization did not decrease the final cost, but given the steady decrease over epochs it can still be considered.
Regularization is an important parameter as well. Though it would be interesting to isolate the individual effects of the activation and the regularization on the network, L1 regularization is a prospective candidate.
Summary

Generative adversarial networks (GANs) are a class of artificial intelligence algorithms used in unsupervised machine learning, implemented as a system of two neural networks. This technique can generate photographs that look at least superficially authentic to human observers, with many realistic characteristics (though in tests people can often tell real from generated).
GANs consist of a generator and a discriminator. In essence, the generator produces images, and the discriminator tries to identify the differences between the generated images and the actual images. The generator tries to fool the discriminator, whereas the discriminator tries to minimize its own loss. The ideal case is when the generated images are least distinguishable from the actual images.
The dataset is again the MNIST database of handwritten digits described in the RBM section above: 60,000 training and 10,000 test examples of 28x28 grayscale digit images with labels 0 to 9, loaded automatically from the TensorFlow library with no prerequisites.
This network uses Xavier initialization for the weights (the biases start at zero). The generator and discriminator each have one hidden layer with the ReLU activation function, and use the Adam solver. The output probability is produced by the sigmoid activation function. The network was trained for 10,000 epochs.
The TensorFlow code follows the same pattern as before: we initialize placeholders, weights and biases, and hyperparameters, then start the TensorFlow session, in this case for both the generator and the discriminator. Let's walk through the code now.
Import necessary libraries as below
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import os
Next, we initialize the network with xavier initialization
def xavier_init(size):
in_dim = size[0]
xavier_stddev = 1. / tf.sqrt(in_dim / 2.)
return tf.random_normal(shape=size, stddev=xavier_stddev)
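The helper above draws weights with standard deviation 1/sqrt(in_dim/2), i.e. variance 2/fan_in. A quick NumPy check of that variance rule (an illustration with a fixed seed, not part of the network code):

```python
import numpy as np

in_dim = 784  # matches the discriminator's input dimension below
xavier_stddev = 1.0 / np.sqrt(in_dim / 2.0)

rng = np.random.default_rng(42)
w = rng.normal(0.0, xavier_stddev, size=(in_dim, 128))
print(round(xavier_stddev, 4))   # target std: ~0.0505
print(round(float(w.std()), 3))  # empirical std lands close to the target
```

Keeping the weight variance scaled to the fan-in like this helps the activations keep a comparable scale from layer to layer at the start of training.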
Assign weights and bias to the generator and the discriminator as shown below
# weights and bias for the discriminator
X = tf.placeholder(tf.float32, shape=[None, 784])
D_W1 = tf.Variable(xavier_init([784, 128]))
D_b1 = tf.Variable(tf.zeros(shape=[128]))
D_W2 = tf.Variable(xavier_init([128, 1]))
D_b2 = tf.Variable(tf.zeros(shape=[1]))
theta_D = [D_W1, D_W2, D_b1, D_b2]
#Weights and bias for the generator
Z = tf.placeholder(tf.float32, shape=[None, 100])
G_W1 = tf.Variable(xavier_init([100, 128]))
G_b1 = tf.Variable(tf.zeros(shape=[128]))
G_W2 = tf.Variable(xavier_init([128, 784]))
G_b2 = tf.Variable(tf.zeros(shape=[784]))
theta_G = [G_W1, G_W2, G_b1, G_b2]
# function to create a random sample
def sample_Z(m, n):
return np.random.uniform(-1., 1., size=[m, n])
Next we define the layers for the generator and the discriminator. The hidden layers use ReLU, and the sigmoid activation turns the outputs into probabilities; matmul performs each layer's computation as a matrix multiplication.
def generator(z):
    G_h1 = tf.nn.relu(tf.matmul(z, G_W1) + G_b1)
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.sigmoid(G_log_prob)
    return G_prob
# The discriminator(x) takes MNIST image(s) and returns a scalar that represents the probability of a real MNIST image.
def discriminator(x):
    D_h1 = tf.nn.relu(tf.matmul(x, D_W1) + D_b1)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.sigmoid(D_logit)
    return D_prob, D_logit
The function to plot images during training is defined next.
# Plot images
def plot(samples):
    fig = plt.figure(figsize=(4, 4))
    gs = gridspec.GridSpec(4, 4)
    gs.update(wspace=0.05, hspace=0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(28, 28), cmap='Greys_r')
    return fig
Next, we create functions to plot the loss vs epoch for the generator and discriminator
def plot_loss_epoch_generator():
    plt.figure(figsize=(18, 5))
    plt.subplot(1, 2, 1)
    plt.title('Train Loss vs Epoch', fontsize=15)
    plt.plot(epoch_list, g_train_loss, 'r-')
    plt.xlabel('Epoch')
    plt.ylabel('Train Loss')

def plot_loss_epoch_discriminator():
    plt.figure(figsize=(18, 5))
    plt.subplot(1, 2, 1)
    plt.title('Train Loss vs Epoch', fontsize=15)
    plt.plot(epoch_list, d_train_loss, 'r-')
    plt.xlabel('Epoch')
    plt.ylabel('Train Loss')
Next, we build the graph nodes that sample from the generator and evaluate the discriminator on the real and generated images.
G_sample = generator(Z)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample)
Defining the cost function
# D_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_real, labels=tf.ones_like(D_logit_real)))
# D_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.zeros_like(D_logit_fake)))
D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
G_loss = -tf.reduce_mean(tf.log(D_fake))
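Note that G_loss is the non-saturating form -log(D(G(z))) rather than the original minimax term log(1 - D(G(z))). A quick numeric sketch (our own illustration, not part of the original code) of the gradient of each objective with respect to D(G(z)) shows why this form is preferred early in training, when the discriminator confidently rejects fakes:

```python
def minimax_grad(d_fake):
    # d/dD [log(1 - D)] = -1 / (1 - D): nearly flat when D is close to 0
    return -1.0 / (1.0 - d_fake)

def non_saturating_grad(d_fake):
    # d/dD [-log(D)] = -1 / D: very steep when D is close to 0
    return -1.0 / d_fake

d_fake = 0.01  # early training: the discriminator is confident the sample is fake
# |minimax gradient| is about 1.01, |non-saturating gradient| is 100,
# so the non-saturating loss gives the generator a far stronger signal.
```

This is why the commented-out minimax loss above is typically replaced by the -log(D_fake) heuristic in practice.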
Defining the optimizer for the generator and discriminator
D_solver = tf.train.AdamOptimizer().minimize(D_loss, var_list=theta_D)
G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=theta_G)
mb_size = 128
Z_dim = 100
Let's read the data, which is downloaded automatically
mnist = input_data.read_data_sets('../../MNIST_data', one_hot=True)
Start the TensorFlow session. The generated images will be stored in a folder named out, which is created automatically; the path is relative, not absolute.
#empty lists are initialized
g_train_loss=[]
d_train_loss=[]
epoch_list=[]
sess = tf.Session()
sess.run(tf.global_variables_initializer())
#create a folder if it does not exist
if not os.path.exists('out/'):
    os.makedirs('out/')
i = 0
#start training and store the generated images in the out folder
for it in range(10000):
    if it % 100 == 0:
        samples = sess.run(G_sample, feed_dict={Z: sample_Z(16, Z_dim)})
        fig = plot(samples)
        plt.savefig('out/{}.png'.format(str(i).zfill(3)), bbox_inches='tight')
        i += 1
        plt.close(fig)
    X_mb, _ = mnist.train.next_batch(mb_size)
    _, D_loss_curr = sess.run([D_solver, D_loss], feed_dict={X: X_mb, Z: sample_Z(mb_size, Z_dim)})
    _, G_loss_curr = sess.run([G_solver, G_loss], feed_dict={Z: sample_Z(mb_size, Z_dim)})
    if it % 1000 == 0:
        # epoch_list.append(it)
        print('Iter: {}'.format(it))
        print('D loss: {:.4}'.format(D_loss_curr))
        d_train_loss.append(D_loss_curr)
        print('G_loss: {:.4}'.format(G_loss_curr))
        g_train_loss.append(G_loss_curr)  # append the generator loss to its own list
        print()
# plot_loss_epoch_generator()
# plot_loss_epoch_discriminator()
It is observed that in the beginning the generator loss is low and the discriminator loss is high: the discriminator has not yet learned to separate the generator's near-random images from the real ones. As the number of epochs increases, the discriminator gets better at telling them apart, so the discriminator loss decreases while the generator loss increases. Below are snapshots of the first and the last image generated.
The final losses are as follows:
D loss: 0.6848 G_loss: 3.047
The first image

The Last image

The images are saved in the out folder, which is created automatically in the working directory.
Let's perform hyperparameter tuning for the GAN, starting with the activation function.
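To make the search space concrete, the three candidate activations can be written in plain numpy (a sketch for intuition only; the experiments below use the tf.nn versions, and train_gan is a hypothetical stand-in for re-running the model):

```python
import numpy as np

def sigmoid(x):
    # squashes any input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # zeroes out negative inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.2):
    # like relu, but negative inputs keep a small slope alpha
    return np.where(x > 0, x, alpha * x)

candidates = {'sigmoid': sigmoid, 'relu': relu, 'leaky_relu': leaky_relu}
# Manual single-parameter search: retrain once per candidate and record the final losses.
# results = {name: train_gan(activation=fn) for name, fn in candidates.items()}
```

Each experiment below is exactly one iteration of this loop, done by hand: swap the activation, retrain, and compare the losses and images.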
Next, we are going to observe the performance when the activation function is ReLU. The initial model code remains the same; only the activation function for the generator and discriminator is changed to ReLU. Observe the change in the code. The images will be saved in the out_activation_relu folder, which is created automatically.
def plot_loss_epoch_generator():
    plt.figure(figsize=(18, 5))
    plt.subplot(1, 2, 1)
    plt.title('Generator Train Loss vs Epoch', fontsize=15)
    plt.plot(epoch_list, g_train_loss, 'r-')
    plt.xlabel('Epoch')
    plt.ylabel('Generator Train Loss')

def plot_loss_epoch_discriminator():
    plt.figure(figsize=(18, 5))
    plt.subplot(1, 2, 1)
    plt.title('Discriminator Train Loss vs Epoch', fontsize=15)
    plt.plot(epoch_list, d_train_loss, 'r-')
    plt.xlabel('Epoch')
    plt.ylabel('Discriminator Train Loss')
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import os
%matplotlib inline
epoch_list = []
d_train_loss = []
g_train_loss = []
def xavier_init(size):
    in_dim = size[0]
    xavier_stddev = 1. / tf.sqrt(in_dim / 2.)
    return tf.random_normal(shape=size, stddev=xavier_stddev)
X = tf.placeholder(tf.float32, shape=[None, 784])
D_W1 = tf.Variable(xavier_init([784, 128]))
D_b1 = tf.Variable(tf.zeros(shape=[128]))
D_W2 = tf.Variable(xavier_init([128, 1]))
D_b2 = tf.Variable(tf.zeros(shape=[1]))
theta_D = [D_W1, D_W2, D_b1, D_b2]
Z = tf.placeholder(tf.float32, shape=[None, 100])
G_W1 = tf.Variable(xavier_init([100, 128]))
G_b1 = tf.Variable(tf.zeros(shape=[128]))
G_W2 = tf.Variable(xavier_init([128, 784]))
G_b2 = tf.Variable(tf.zeros(shape=[784]))
theta_G = [G_W1, G_W2, G_b1, G_b2]
def sample_Z(m, n):
    return np.random.uniform(-1., 1., size=[m, n])
#change the generator activation to relu
def generator(z):
    G_h1 = tf.nn.relu(tf.matmul(z, G_W1) + G_b1)
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.relu(G_log_prob)
    return G_prob
#change the discriminator activation to relu
def discriminator(x):
    D_h1 = tf.nn.relu(tf.matmul(x, D_W1) + D_b1)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.relu(D_logit)
    return D_prob, D_logit
def plot(samples):
    fig = plt.figure(figsize=(4, 4))
    gs = gridspec.GridSpec(4, 4)
    gs.update(wspace=0.05, hspace=0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(28, 28), cmap='Greys_r')
    return fig
G_sample = generator(Z)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample)
# D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
# G_loss = -tf.reduce_mean(tf.log(D_fake))
# Alternative losses:
# -------------------
D_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_real, labels=tf.ones_like(D_logit_real)))
D_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.zeros_like(D_logit_fake)))
D_loss = D_loss_real + D_loss_fake
G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.ones_like(D_logit_fake)))
D_solver = tf.train.AdamOptimizer().minimize(D_loss, var_list=theta_D)
G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=theta_G)
#minibatch size
mb_size = 128
Z_dim = 100
#reading the dataset from tensorflow
mnist = input_data.read_data_sets('../../MNIST_data', one_hot=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
#creates a folder if it does not exist
if not os.path.exists('out_activation_relu/'):
    os.makedirs('out_activation_relu/')
i = 0
for it in range(10000):
    if it % 1000 == 0:
        samples = sess.run(G_sample, feed_dict={Z: sample_Z(16, Z_dim)})
        fig = plot(samples)
        plt.savefig('out_activation_relu/{}.png'.format(str(i).zfill(3)), bbox_inches='tight')
        i += 1
        plt.close(fig)
    X_mb, _ = mnist.train.next_batch(mb_size)
    _, D_loss_curr = sess.run([D_solver, D_loss], feed_dict={X: X_mb, Z: sample_Z(mb_size, Z_dim)})
    _, G_loss_curr = sess.run([G_solver, G_loss], feed_dict={Z: sample_Z(mb_size, Z_dim)})
    if it % 100 == 0:
        print('Iter: {}'.format(it))
        print('D loss: {:.4}'.format(D_loss_curr))
        print('G_loss: {:.4}'.format(G_loss_curr))
        print()
        d_train_loss.append(D_loss_curr)
        g_train_loss.append(G_loss_curr)
        epoch_list.append(it)
plot_loss_epoch_generator()
plot_loss_epoch_discriminator()
It is observed that the generator loss increases and the discriminator loss decreases as the number of epochs increases.
The first image

last Image

Hence, the sigmoid activation was better, as the image was clearer, but ReLU can be considered a good alternative.
Loss D loss: 0.9232 G_loss: 2.121
Next, let's consider the Leaky ReLU activation function. The initial model code remains the same, as we see below; only the activation function changes. Observe the comments closely to see the change made.
# Hyper parameter tuning Activation Function Leaky Relu
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import os
%matplotlib inline
epoch_list = []
d_train_loss = []
g_train_loss = []
def xavier_init(size):
    in_dim = size[0]
    xavier_stddev = 1. / tf.sqrt(in_dim / 2.)
    return tf.random_normal(shape=size, stddev=xavier_stddev)
X = tf.placeholder(tf.float32, shape=[None, 784])
D_W1 = tf.Variable(xavier_init([784, 128]))
D_b1 = tf.Variable(tf.zeros(shape=[128]))
D_W2 = tf.Variable(xavier_init([128, 1]))
D_b2 = tf.Variable(tf.zeros(shape=[1]))
theta_D = [D_W1, D_W2, D_b1, D_b2]
Z = tf.placeholder(tf.float32, shape=[None, 100])
G_W1 = tf.Variable(xavier_init([100, 128]))
G_b1 = tf.Variable(tf.zeros(shape=[128]))
G_W2 = tf.Variable(xavier_init([128, 784]))
G_b2 = tf.Variable(tf.zeros(shape=[784]))
theta_G = [G_W1, G_W2, G_b1, G_b2]
def sample_Z(m, n):
    return np.random.uniform(-1., 1., size=[m, n])
#activation function leaky relu for the generator
def generator(z):
    G_h1 = tf.nn.leaky_relu(tf.matmul(z, G_W1) + G_b1)
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.leaky_relu(G_log_prob)
    return G_prob
#activation function leaky relu for the discriminator
def discriminator(x):
    D_h1 = tf.nn.leaky_relu(tf.matmul(x, D_W1) + D_b1)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.leaky_relu(D_logit)
    return D_prob, D_logit
def plot(samples):
    fig = plt.figure(figsize=(4, 4))
    gs = gridspec.GridSpec(4, 4)
    gs.update(wspace=0.05, hspace=0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(28, 28), cmap='Greys_r')
    return fig
G_sample = generator(Z)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample)
# D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
# G_loss = -tf.reduce_mean(tf.log(D_fake))
# Alternative losses:
# -------------------
D_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_real, labels=tf.ones_like(D_logit_real)))
D_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.zeros_like(D_logit_fake)))
D_loss = D_loss_real + D_loss_fake
G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.ones_like(D_logit_fake)))
D_solver = tf.train.AdamOptimizer().minimize(D_loss, var_list=theta_D)
G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=theta_G)
mb_size = 128
Z_dim = 100
mnist = input_data.read_data_sets('../../MNIST_data', one_hot=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
if not os.path.exists('out_activation_leakyrelu/'):
    os.makedirs('out_activation_leakyrelu/')
i = 0
for it in range(10000):
    if it % 1000 == 0:
        samples = sess.run(G_sample, feed_dict={Z: sample_Z(16, Z_dim)})
        fig = plot(samples)
        plt.savefig('out_activation_leakyrelu/{}.png'.format(str(i).zfill(3)), bbox_inches='tight')
        i += 1
        plt.close(fig)
    X_mb, _ = mnist.train.next_batch(mb_size)
    _, D_loss_curr = sess.run([D_solver, D_loss], feed_dict={X: X_mb, Z: sample_Z(mb_size, Z_dim)})
    _, G_loss_curr = sess.run([G_solver, G_loss], feed_dict={Z: sample_Z(mb_size, Z_dim)})
    if it % 100 == 0:
        print('Iter: {}'.format(it))
        print('D loss: {:.4}'.format(D_loss_curr))
        print('G_loss: {:.4}'.format(G_loss_curr))
        print()
        d_train_loss.append(D_loss_curr)
        g_train_loss.append(G_loss_curr)
        epoch_list.append(it)
plot_loss_epoch_generator()
plot_loss_epoch_discriminator()
It can be clearly observed that the discriminator loss gradually decreases as the generator loss increases. Let's have a look at the images generated:
1st Image on the first epoch:

Last Image on the last epoch

D loss: 1.518 G_loss: 1.112
Leaky ReLU and ReLU have nearly the same outputs after 7,000 iterations, but the output is much clearer with the sigmoid activation. Hence, the sigmoid activation function is the most preferred here, and the activation is an important parameter to tune for the GAN.
Now let's observe the combination of the optimizer and the learning rate. We will first consider the Stochastic Gradient Descent (SGD) optimizer with a learning rate of 0.001.
The code remains the same as the initial model; the only changes are the optimizer and the learning rate. Observe the change in the code by following the comments.
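As a reminder of what is being swapped, here is a toy numpy sketch (our own illustration, not from the original code) of a single parameter update under plain SGD versus Adam, on the loss L(w) = w^2, whose gradient is 2w:

```python
import numpy as np

def sgd_step(w, grad, lr):
    # plain gradient descent: step size scales directly with the gradient
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad        # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2   # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)           # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w = 1.0
w_sgd = sgd_step(w, 2 * w, lr=0.001)             # 1 - 0.001 * 2 = 0.998
w_adam, m, v = adam_step(w, 2 * w, 0.0, 0.0, t=1)  # first Adam step is ~lr = 0.001
```

Adam's first bias-corrected step is roughly lr in size regardless of the gradient's scale, which is part of why it usually needs less learning-rate tuning than SGD; with SGD the learning rate matters much more, which is what the next experiments probe.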
# Hyperparameter tuning: optimizer SGD with learning rate 0.001
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import os
%matplotlib inline
epoch_list = []
d_train_loss = []
g_train_loss = []
def xavier_init(size):
    in_dim = size[0]
    xavier_stddev = 1. / tf.sqrt(in_dim / 2.)
    return tf.random_normal(shape=size, stddev=xavier_stddev)
X = tf.placeholder(tf.float32, shape=[None, 784])
D_W1 = tf.Variable(xavier_init([784, 128]))
D_b1 = tf.Variable(tf.zeros(shape=[128]))
D_W2 = tf.Variable(xavier_init([128, 1]))
D_b2 = tf.Variable(tf.zeros(shape=[1]))
theta_D = [D_W1, D_W2, D_b1, D_b2]
Z = tf.placeholder(tf.float32, shape=[None, 100])
G_W1 = tf.Variable(xavier_init([100, 128]))
G_b1 = tf.Variable(tf.zeros(shape=[128]))
G_W2 = tf.Variable(xavier_init([128, 784]))
G_b2 = tf.Variable(tf.zeros(shape=[784]))
theta_G = [G_W1, G_W2, G_b1, G_b2]
def sample_Z(m, n):
    return np.random.uniform(-1., 1., size=[m, n])
def generator(z):
    G_h1 = tf.nn.relu(tf.matmul(z, G_W1) + G_b1)
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.sigmoid(G_log_prob)
    return G_prob
def discriminator(x):
    D_h1 = tf.nn.relu(tf.matmul(x, D_W1) + D_b1)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.sigmoid(D_logit)
    return D_prob, D_logit
def plot(samples):
    fig = plt.figure(figsize=(4, 4))
    gs = gridspec.GridSpec(4, 4)
    gs.update(wspace=0.05, hspace=0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(28, 28), cmap='Greys_r')
    return fig
G_sample = generator(Z)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample)
# D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
# G_loss = -tf.reduce_mean(tf.log(D_fake))
# Alternative losses:
# -------------------
D_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_real, labels=tf.ones_like(D_logit_real)))
D_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.zeros_like(D_logit_fake)))
D_loss = D_loss_real + D_loss_fake
G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.ones_like(D_logit_fake)))
#change the optimizer and the learning rate here: set the optimizer to SGD and the learning rate to 0.001
D_solver = tf.train.GradientDescentOptimizer(0.001).minimize(D_loss, var_list=theta_D)
G_solver = tf.train.GradientDescentOptimizer(0.001).minimize(G_loss, var_list=theta_G)
mb_size = 128
Z_dim = 100
mnist = input_data.read_data_sets('../../MNIST_data', one_hot=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
if not os.path.exists('out_optimizer_SGD/'):
    os.makedirs('out_optimizer_SGD/')
i = 0
for it in range(10000):
    if it % 1000 == 0:
        samples = sess.run(G_sample, feed_dict={Z: sample_Z(16, Z_dim)})
        fig = plot(samples)
        plt.savefig('out_optimizer_SGD/{}.png'.format(str(i).zfill(3)), bbox_inches='tight')
        i += 1
        plt.close(fig)
    X_mb, _ = mnist.train.next_batch(mb_size)
    _, D_loss_curr = sess.run([D_solver, D_loss], feed_dict={X: X_mb, Z: sample_Z(mb_size, Z_dim)})
    _, G_loss_curr = sess.run([G_solver, G_loss], feed_dict={Z: sample_Z(mb_size, Z_dim)})
    if it % 100 == 0:
        print('Iter: {}'.format(it))
        print('D loss: {:.4}'.format(D_loss_curr))
        print('G_loss: {:.4}'.format(G_loss_curr))
        print()
        d_train_loss.append(D_loss_curr)
        g_train_loss.append(G_loss_curr)
        epoch_list.append(it)
plot_loss_epoch_generator()
plot_loss_epoch_discriminator()
Observation:
It is interesting to observe that there is a distinct rise and fall in the Discriminator and generator losses using the SGD Optimizer with a learning rate of 0.001.
Comparing the first and last image


The images are fairly blurred, indicating that SGD was not especially effective with a learning rate of 0.001, although the discriminator loss decreased considerably. Let's change the learning rate and observe whether that has an effect.
It is clear that tuning the optimizer is extremely important.
D loss: 0.2892 G_loss: 2.531
Let's change the learning rate to 0.1, keeping SGD as the optimizer.
The initial code for the model remains the same; the only difference is the learning rate. Observe the comments to view the changes made.
# Hyperparameter tuning: optimizer SGD with learning rate 0.1
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import os
%matplotlib inline
epoch_list = []
d_train_loss = []
g_train_loss = []
def xavier_init(size):
    in_dim = size[0]
    xavier_stddev = 1. / tf.sqrt(in_dim / 2.)
    return tf.random_normal(shape=size, stddev=xavier_stddev)
X = tf.placeholder(tf.float32, shape=[None, 784])
D_W1 = tf.Variable(xavier_init([784, 128]))
D_b1 = tf.Variable(tf.zeros(shape=[128]))
D_W2 = tf.Variable(xavier_init([128, 1]))
D_b2 = tf.Variable(tf.zeros(shape=[1]))
theta_D = [D_W1, D_W2, D_b1, D_b2]
Z = tf.placeholder(tf.float32, shape=[None, 100])
G_W1 = tf.Variable(xavier_init([100, 128]))
G_b1 = tf.Variable(tf.zeros(shape=[128]))
G_W2 = tf.Variable(xavier_init([128, 784]))
G_b2 = tf.Variable(tf.zeros(shape=[784]))
theta_G = [G_W1, G_W2, G_b1, G_b2]
def sample_Z(m, n):
    return np.random.uniform(-1., 1., size=[m, n])
def generator(z):
    G_h1 = tf.nn.relu(tf.matmul(z, G_W1) + G_b1)
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.sigmoid(G_log_prob)
    return G_prob
def discriminator(x):
    D_h1 = tf.nn.relu(tf.matmul(x, D_W1) + D_b1)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.sigmoid(D_logit)
    return D_prob, D_logit
def plot(samples):
    fig = plt.figure(figsize=(4, 4))
    gs = gridspec.GridSpec(4, 4)
    gs.update(wspace=0.05, hspace=0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(28, 28), cmap='Greys_r')
    return fig
G_sample = generator(Z)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample)
# D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
# G_loss = -tf.reduce_mean(tf.log(D_fake))
# Alternative losses:
# -------------------
D_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_real, labels=tf.ones_like(D_logit_real)))
D_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.zeros_like(D_logit_fake)))
D_loss = D_loss_real + D_loss_fake
G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.ones_like(D_logit_fake)))
# change the optimizer to SGD and the learning rate to 0.1
D_solver = tf.train.GradientDescentOptimizer(0.1).minimize(D_loss, var_list=theta_D)
G_solver = tf.train.GradientDescentOptimizer(0.1).minimize(G_loss, var_list=theta_G)
mb_size = 128
Z_dim = 100
mnist = input_data.read_data_sets('../../MNIST_data', one_hot=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# this folder will be automatically created and the images will be stored here
if not os.path.exists('out_optimizer_SGD_LR/'):
    os.makedirs('out_optimizer_SGD_LR/')
i = 0
for it in range(10000):
    if it % 1000 == 0:
        samples = sess.run(G_sample, feed_dict={Z: sample_Z(16, Z_dim)})
        fig = plot(samples)
        plt.savefig('out_optimizer_SGD_LR/{}.png'.format(str(i).zfill(3)), bbox_inches='tight')
        i += 1
        plt.close(fig)
    X_mb, _ = mnist.train.next_batch(mb_size)
    _, D_loss_curr = sess.run([D_solver, D_loss], feed_dict={X: X_mb, Z: sample_Z(mb_size, Z_dim)})
    _, G_loss_curr = sess.run([G_solver, G_loss], feed_dict={Z: sample_Z(mb_size, Z_dim)})
    if it % 100 == 0:
        print('Iter: {}'.format(it))
        print('D loss: {:.4}'.format(D_loss_curr))
        print('G_loss: {:.4}'.format(G_loss_curr))
        print()
        d_train_loss.append(D_loss_curr)
        g_train_loss.append(G_loss_curr)
        epoch_list.append(it)
plot_loss_epoch_generator()
plot_loss_epoch_discriminator()
Increasing the learning rate directly affected the loss, as seen in the changed generator and discriminator loss graphs; the loss now rises and falls with the higher learning rate. Let's compare the images.
First Image

Last Image

Increasing the learning rate did have an effect, as the model now learned the regions of pixels. Still, in comparison to the previous images with SGD (learning rate of 0.001), the pixels remain sprawled and not localized. At this point, it would be interesting to see whether training the very same model for a larger number of epochs would have an impact.
D loss: 0.3045 G_loss: 2.856
Let's increase the number of epochs to see whether that helps; it must be noted that the discriminator loss is already low in comparison to the benchmark model, which is a good sign.
Next, let's keep all the hyperparameters as they are and increase the number of epochs to 20,000 to see whether training longer has an impact.
## Increasing number of Epochs with SGD and LR 0.1
# Hyperparameter tuning: SGD, learning rate 0.1, 20,000 iterations
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import os
%matplotlib inline
epoch_list = []
d_train_loss = []
g_train_loss = []
def xavier_init(size):
    in_dim = size[0]
    xavier_stddev = 1. / tf.sqrt(in_dim / 2.)
    return tf.random_normal(shape=size, stddev=xavier_stddev)
X = tf.placeholder(tf.float32, shape=[None, 784])
D_W1 = tf.Variable(xavier_init([784, 128]))
D_b1 = tf.Variable(tf.zeros(shape=[128]))
D_W2 = tf.Variable(xavier_init([128, 1]))
D_b2 = tf.Variable(tf.zeros(shape=[1]))
theta_D = [D_W1, D_W2, D_b1, D_b2]
Z = tf.placeholder(tf.float32, shape=[None, 100])
G_W1 = tf.Variable(xavier_init([100, 128]))
G_b1 = tf.Variable(tf.zeros(shape=[128]))
G_W2 = tf.Variable(xavier_init([128, 784]))
G_b2 = tf.Variable(tf.zeros(shape=[784]))
theta_G = [G_W1, G_W2, G_b1, G_b2]
def sample_Z(m, n):
    return np.random.uniform(-1., 1., size=[m, n])
def generator(z):
    G_h1 = tf.nn.relu(tf.matmul(z, G_W1) + G_b1)
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.sigmoid(G_log_prob)
    return G_prob
def discriminator(x):
    D_h1 = tf.nn.relu(tf.matmul(x, D_W1) + D_b1)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.sigmoid(D_logit)
    return D_prob, D_logit
def plot(samples):
    fig = plt.figure(figsize=(4, 4))
    gs = gridspec.GridSpec(4, 4)
    gs.update(wspace=0.05, hspace=0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(28, 28), cmap='Greys_r')
    return fig
G_sample = generator(Z)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample)
# D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
# G_loss = -tf.reduce_mean(tf.log(D_fake))
# Alternative losses:
# -------------------
D_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_real, labels=tf.ones_like(D_logit_real)))
D_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.zeros_like(D_logit_fake)))
D_loss = D_loss_real + D_loss_fake
G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logit_fake, labels=tf.ones_like(D_logit_fake)))
D_solver = tf.train.GradientDescentOptimizer(0.1).minimize(D_loss, var_list=theta_D)
G_solver = tf.train.GradientDescentOptimizer(0.1).minimize(G_loss, var_list=theta_G)
mb_size = 128
Z_dim = 100
mnist = input_data.read_data_sets('../../MNIST_data', one_hot=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
if not os.path.exists('out_optimizer_SGD_LR_Epochs/'):
    os.makedirs('out_optimizer_SGD_LR_Epochs/')
i = 0
# increase the number of iterations to 20000
for it in range(20000):
    if it % 1000 == 0:
        samples = sess.run(G_sample, feed_dict={Z: sample_Z(16, Z_dim)})
        fig = plot(samples)
        plt.savefig('out_optimizer_SGD_LR_Epochs/{}.png'.format(str(i).zfill(3)), bbox_inches='tight')
        i += 1
        plt.close(fig)
    X_mb, _ = mnist.train.next_batch(mb_size)
    _, D_loss_curr = sess.run([D_solver, D_loss], feed_dict={X: X_mb, Z: sample_Z(mb_size, Z_dim)})
    _, G_loss_curr = sess.run([G_solver, G_loss], feed_dict={Z: sample_Z(mb_size, Z_dim)})
    if it % 100 == 0:
        print('Iter: {}'.format(it))
        print('D loss: {:.4}'.format(D_loss_curr))
        print('G_loss: {:.4}'.format(G_loss_curr))
        print()
        d_train_loss.append(D_loss_curr)
        g_train_loss.append(G_loss_curr)
        epoch_list.append(it)
plot_loss_epoch_generator()
plot_loss_epoch_discriminator()
Increasing the number of epochs, together with the increased learning rate and the SGD optimizer, did improve the quality of the images generated. This is evident from the decrease in the discriminator loss.
D loss: 0.2999 G_loss: 2.861
This loss is much lower than the loss of the benchmark model.


It is observed that when the learning rate was increased with Stochastic Gradient Descent, the loss decreased further. It is worth mentioning that even with a learning rate of 0.001 the loss was much lower than that of the benchmark model. Hence, the optimizer and the learning rate are an important combination to tune for a GAN.
It is also observed that the activation function plays an important role in tuning the GAN; the combination of ReLU in the hidden layer and sigmoid at the output provided the lowest loss for the discriminator.
Training the model with SGD and a learning rate of 0.001, and then 0.1, reduced the loss significantly; it was much better than the loss of the benchmark model. Additionally, increasing the number of epochs along with the increased learning rate helped further.
To conclude, the activation function, optimizer, and learning rate are important parameters to tune for a GAN.
Summary

An autoencoder is an artificial neural network used for unsupervised learning of efficient codings. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for the purpose of dimensionality reduction. Recently, the autoencoder concept has become more widely used for learning generative models of data.
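The idea can be demonstrated end to end with a tiny linear autoencoder in plain numpy (a self-contained sketch for intuition, not the convolutional model built below): data lying on a 2-D subspace of R^8 is squeezed through a 2-unit bottleneck and reconstructed, with gradient descent driving the reconstruction error down.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data that truly lives on a 2-D subspace of R^8,
# so a 2-unit bottleneck is enough to represent it
basis = rng.normal(size=(2, 8)) / np.sqrt(8)
X = rng.normal(size=(200, 2)) @ basis

d, k = 8, 2
W_enc = rng.normal(scale=0.1, size=(d, k))  # encoder weights: R^8 -> R^2
W_dec = rng.normal(scale=0.1, size=(k, d))  # decoder weights: R^2 -> R^8

losses = []
lr = 0.05
for step in range(3000):
    H = X @ W_enc        # encode: (200, 2) latent codes
    X_hat = H @ W_dec    # decode: (200, 8) reconstructions
    err = X_hat - X
    losses.append(float(np.mean(err ** 2)))
    # gradients of the mean-squared reconstruction error
    g_out = 2.0 * err / X.shape[0]
    grad_dec = H.T @ g_out
    grad_enc = X.T @ (g_out @ W_dec.T)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# losses[-1] should end up far below losses[0]: the bottleneck learns the subspace
```

The convolutional autoencoder below is the same recipe with conv/pool layers as the encoder, transposed convolutions as the decoder, and noisy inputs to make it a denoiser.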
The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. The dataset consists of pairs of a handwritten digit image and a label: each image is a 28 x 28 pixel grayscale image, and the label is the actual digit the image represents, from 0 to 9 (10 classes in total).
This is a popular dataset in data science, often used as a "Hello World" dataset.
We will use the MNIST dataset from the TensorFlow library; it is downloaded automatically by the TensorFlow code, so there are no prerequisites.
This is a denoising autoencoder. The autoencoder uses the leaky ReLU activation. The encoder and decoder consist of convolutional layers; the output uses a sigmoid activation with the sigmoid cross-entropy cost function and the Adam optimizer. It is trained with a learning rate of 0.00001 and a noise factor of 0.5.
The structure of the TensorFlow code for the autoencoder is similar: it consists of placeholders, hyperparameter initialization, and running the TensorFlow session. Let's walk through the code.
The encoder has two convolutional layers and two max-pooling layers. Convolution layer 1 and convolution layer 2 each have 32 3 x 3 filters, and the two max-pooling layers each have a size of 2 x 2.
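The spatial sizes noted in the code comments below can be derived mechanically. A small sketch (the helper names are our own) applying the SAME-convolution and 2 x 2 max-pooling size rules:

```python
import math

def same_conv(size, stride=1):
    # A SAME-padded convolution with stride 1 preserves the spatial size
    return math.ceil(size / stride)

def max_pool(size, pool=2, stride=2):
    # VALID max pooling (the tf.layers.max_pooling2d default padding)
    return (size - pool) // stride + 1

s = 28
s = same_conv(s)   # conv1: stays 28
s = max_pool(s)    # pool1: 28 -> 14
s = same_conv(s)   # conv2: stays 14
s = max_pool(s)    # encoding: 14 -> 7
```

So the 28 x 28 input is compressed to a 7 x 7 (x 32 channels) latent representation, which the decoder then upsamples back to 28 x 28.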
Import necessary libraries
import numpy as np
import sys
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline
Read the MNIST dataset from TensorFlow
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
Define the input and output placeholders
inputs_ = tf.placeholder(tf.float32,[None,28,28,1])
targets_ = tf.placeholder(tf.float32,[None,28,28,1])
Define activation functions as leaky relu
#alpha is the negative slope coefficient
def lrelu(x, alpha=0.1):
    return tf.maximum(alpha * x, x)
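A quick numpy check (our own sketch mirroring lrelu, not part of the original code) of what this expression does: positive inputs pass through unchanged, while negative inputs are scaled by alpha instead of being zeroed as in plain ReLU.

```python
import numpy as np

def lrelu_np(x, alpha=0.1):
    # Same expression as the TensorFlow version above
    return np.maximum(alpha * x, x)

x = np.array([-2.0, -0.5, 0.0, 3.0])
out = lrelu_np(x)
# Negatives keep a small slope: [-0.2, -0.05, 0.0, 3.0]
```

Keeping a small slope on the negative side avoids "dead" units that a plain ReLU can produce when a neuron's pre-activations are always negative.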
Defining the encoder: it consists of two convolutional layers and two max-pooling layers, with each max-pooling layer downsampling the image.
### Encoder
with tf.name_scope('en-convolutions'):
    conv1 = tf.layers.conv2d(inputs_, filters=32, kernel_size=(3,3), strides=(1,1), padding='SAME', use_bias=True, activation=lrelu, name='conv1')
# Now 28x28x32
with tf.name_scope('en-pooling'):
    maxpool1 = tf.layers.max_pooling2d(conv1, pool_size=(2,2), strides=(2,2), name='pool1')
# Now 14x14x32
with tf.name_scope('en-convolutions'):
    conv2 = tf.layers.conv2d(maxpool1, filters=32, kernel_size=(3,3), strides=(1,1), padding='SAME', use_bias=True, activation=lrelu, name='conv2')
# Now 14x14x32
with tf.name_scope('encoding'):
    encoded = tf.layers.max_pooling2d(conv2, pool_size=(2,2), strides=(2,2), name='encoding')
# Now 7x7x32: the latent space
The decoder upsamples the image back to its original size
### Decoder
with tf.name_scope('decoder'):
    conv3 = tf.layers.conv2d(encoded, filters=32, kernel_size=(3,3), strides=(1,1), name='conv3', padding='SAME', use_bias=True, activation=lrelu)
    # Now 7x7x32
    upsample1 = tf.layers.conv2d_transpose(conv3, filters=32, kernel_size=3, padding='same', strides=2, name='upsample1')
    # Now 14x14x32
    upsample2 = tf.layers.conv2d_transpose(upsample1, filters=32, kernel_size=3, padding='same', strides=2, name='upsample2')
    # Now 28x28x32
    logits = tf.layers.conv2d(upsample2, filters=1, kernel_size=(3,3), strides=(1,1), name='logits', padding='SAME', use_bias=True)
    # Now 28x28x1
# Pass the logits through a sigmoid to get the reconstructed image
decoded = tf.sigmoid(logits, name='recon')
Define the loss, learning rate, and optimizer
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits,labels=targets_)
learning_rate=tf.placeholder(tf.float32)
cost = tf.reduce_mean(loss) #cost
opt = tf.train.AdamOptimizer(learning_rate).minimize(cost) #optimizer
Lets train the model and initialize the tensorflow session. Follow the comments closely to understand the code.
# Training
sess = tf.Session()
#tf.reset_default_graph()
# saver = tf.train.Saver()
loss = []
valid_loss = []
display_step = 1
epochs = 5
batch_size = 64
#lr=[1e-3/(2**(i//5))for i in range(epochs)]
#learning rate
lr=1e-5
sess.run(tf.global_variables_initializer())
# writer = tf.summary.FileWriter('./graphs', sess.graph)
#training in batches
for e in range(epochs):
total_batch = int(mnist.train.num_examples/batch_size)
for ibatch in range(total_batch):
batch_x = mnist.train.next_batch(batch_size)
batch_test_x= mnist.test.next_batch(batch_size)
imgs_test = batch_test_x[0].reshape((-1, 28, 28, 1))
#introduce noise for the test set
noise_factor = 0.5
x_test_noisy = imgs_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs_test.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
imgs = batch_x[0].reshape((-1, 28, 28, 1))
#introduce noise for train set
x_train_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: x_train_noisy,
targets_: imgs,learning_rate:lr})
#calculate the cost
batch_cost_test = sess.run(cost, feed_dict={inputs_: x_test_noisy,
targets_: imgs_test})
if (e+1) % display_step == 0:
print("Epoch: {}/{}...".format(e+1, epochs),
"Training loss: {:.4f}".format(batch_cost),
"Validation loss: {:.4f}".format(batch_cost_test))
# Plot the loss and epochs
loss.append(batch_cost)
valid_loss.append(batch_cost_test)
plt.plot(range(e+1), loss, 'bo', label='Training loss')
plt.plot(range(e+1), valid_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs ',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.legend()
plt.figure()
plt.show()
# saver.save(sess, 'encode_model')
# plot real image, noise image and generated image
batch_x= mnist.test.next_batch(10)
imgs = batch_x[0].reshape((-1, 28, 28, 1))
noise_factor = 0.5
x_test_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
recon_img = sess.run([decoded], feed_dict={inputs_: x_test_noisy})[0]
plt.figure(figsize=(20, 4))
plt.title('Reconstructed Images')
print("Original Images")
for i in range(10):
plt.subplot(2, 10, i+1)
plt.imshow(imgs[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Noisy Images")
for i in range(10):
plt.subplot(2, 10, i+1)
plt.imshow(x_test_noisy[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Reconstruction of Noisy Images")
for i in range(10):
plt.subplot(2, 10, i+1)
plt.imshow(recon_img[i, ..., 0], cmap='gray')
plt.show()
# writer.close()
sess.close()
We observe that the training loss and validation loss track each other closely at every epoch, which indicates consistent behavior between training and testing. The model was run for only 5 epochs because image generation took extremely long on the available computational power. Even so, after 5 epochs the autoencoder has done a good job: the generated images are distinct enough from one another that each digit can be faintly recognized. The configuration used the Adam optimizer, a two-layer convolutional network for both the encoder and the decoder, and a leaky ReLU activation.
Training loss: 0.1575 Validation loss: 0.1575
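The denoising setup above always pairs a noise-corrupted input with the clean image as the target. That corruption step can be sketched on its own in NumPy (the helper name `add_noise` is mine, not from the blog's code):

```python
import numpy as np

def add_noise(images, noise_factor=0.5, seed=None):
    """Corrupt images with additive Gaussian noise, then clip back to [0, 1]."""
    rng = np.random.RandomState(seed)
    noisy = images + noise_factor * rng.normal(loc=0.0, scale=1.0, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

# A batch of 4 synthetic 28x28 grayscale "images" with values in [0, 1]
batch = np.random.RandomState(0).rand(4, 28, 28, 1)
noisy_batch = add_noise(batch, noise_factor=0.5, seed=1)
# The network is then fed noisy_batch as inputs_ and the clean batch as targets_
```

The clipping matters: without it, the corrupted pixels would fall outside the [0, 1] range that the sigmoid output of the decoder can reproduce.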
Next, we will tune various hyperparameters controlling the autoencoder.
Let's start by swapping the optimizer to RMSProp. The initial code of the model remains the same; look for the changes in the comments.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
reset_graph()
# Place holders for the network
inputs_ = tf.placeholder(tf.float32,[None,28,28,1])
targets_ = tf.placeholder(tf.float32,[None,28,28,1])
# Leaky ReLU activation function
def lrelu(x,alpha=0.1):
return tf.maximum(alpha*x,x)
### Encoder
with tf.name_scope('en-convolutions'):
conv1 = tf.layers.conv2d(inputs_,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv1')
# Now 28x28x32
with tf.name_scope('en-pooling'):
maxpool1 = tf.layers.max_pooling2d(conv1,pool_size=(2,2),strides=(2,2),name='pool1')
# Now 14x14x32
with tf.name_scope('en-convolutions'):
conv2 = tf.layers.conv2d(maxpool1,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv2')
# Now 14x14x32
with tf.name_scope('encoding'):
encoded = tf.layers.max_pooling2d(conv2,pool_size=(2,2),strides=(2,2),name='encoding')
# Now 7x7x32.
#latent space
### Decoder
with tf.name_scope('decoder'):
conv3 = tf.layers.conv2d(encoded,filters=32,kernel_size=(3,3),strides=(1,1),name='conv3',padding='SAME',use_bias=True,activation=lrelu)
#Now 7x7x32
upsample1 = tf.layers.conv2d_transpose(conv3,filters=32,kernel_size=3,padding='same',strides=2,name='upsample1')
# Now 14x14x32
upsample2 = tf.layers.conv2d_transpose(upsample1,filters=32,kernel_size=3,padding='same',strides=2,name='upsample2')
# Now 28x28x32
logits = tf.layers.conv2d(upsample2,filters=1,kernel_size=(3,3),strides=(1,1),name='logits',padding='SAME',use_bias=True)
#Now 28x28x1
# Pass logits through sigmoid to get reconstructed image
decoded = tf.sigmoid(logits,name='recon')
#Defining the learning rate and cost
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits,labels=targets_)
# change optimizer to RMSProp
learning_rate=tf.placeholder(tf.float32)
cost = tf.reduce_mean(loss) #cost
opt = tf.train.RMSPropOptimizer(learning_rate).minimize(cost) #optimizer
# Training
sess = tf.Session()
#tf.reset_default_graph()
# saver = tf.train.Saver()
loss = []
valid_loss = []
epoch_list=[]
display_step = 1
epochs = 5
batch_size = 64
#lr=[1e-3/(2**(i//5))for i in range(epochs)]
#learning rate value
lr=1e-5
# Start the Tensorflow Session
sess.run(tf.global_variables_initializer())
# writer = tf.summary.FileWriter('./graphs', sess.graph)
for e in range(epochs):
total_batch = int(mnist.train.num_examples/batch_size)
for ibatch in range(total_batch):
batch_x = mnist.train.next_batch(batch_size)
batch_test_x= mnist.test.next_batch(batch_size)
imgs_test = batch_test_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in the test set
noise_factor = 0.5
x_test_noisy = imgs_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs_test.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
imgs = batch_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in training set
x_train_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
# Loss for the training set
batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: x_train_noisy,
targets_: imgs,learning_rate:lr})
#loss for the testing set
batch_cost_test = sess.run(cost, feed_dict={inputs_: x_test_noisy,
targets_: imgs_test})
if (e+1) % display_step == 0:
print("Epoch: {}/{}...".format(e+1, epochs),
"Training loss: {:.4f}".format(batch_cost),
"Validation loss: {:.4f}".format(batch_cost_test))
loss.append(batch_cost)
valid_loss.append(batch_cost_test)
epoch_list.append(e)
#plotting the validation and training loss
plt.plot(epoch_list, loss, 'bo', label='Training loss')
plt.plot(epoch_list, valid_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs ',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.legend()
plt.figure()
plt.show()
# saver.save(sess, 'encode_model')
#understanding the output for the testing set
# printing original, noise-induced, and generated images
batch_x= mnist.test.next_batch(3)
#inducing noise for the testing set
imgs = batch_x[0].reshape((-1, 28, 28, 1))
noise_factor = 0.5
x_test_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
recon_img = sess.run([decoded], feed_dict={inputs_: x_test_noisy})[0]
plt.figure(figsize=(20, 4))
plt.title('Reconstructed Images')
print("Original Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(imgs[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Noisy Images")
#noisy images
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(x_test_noisy[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Reconstruction of Noisy Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(recon_img[i, ..., 0], cmap='gray')
plt.show()
# writer.close()
sess.close()
Observation
There is clearly a difference between the images generated with the Adam optimizer and with the RMSProp optimizer, which shows that the choice of optimizer plays an important role for the autoencoder. RMSProp did not help here: the images generated by the decoder are noticeably less clear, even though the validation and training losses still decrease consistently. Let's explore the Adadelta optimizer next.
Training loss: 0.1678 Validation loss: 0.1681
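For intuition about why the optimizer choice matters, the core RMSProp rule that `tf.train.RMSPropOptimizer` applies can be sketched in NumPy. This is a simplified version (no momentum or centering, on a toy 1-D quadratic), not the exact TensorFlow implementation:

```python
import numpy as np

def rmsprop_step(theta, grad, cache, lr=0.05, decay=0.9, eps=1e-8):
    """One RMSProp update: divide the step by a running RMS of past gradients."""
    cache = decay * cache + (1.0 - decay) * grad ** 2
    theta = theta - lr * grad / (np.sqrt(cache) + eps)
    return theta, cache

# Minimize f(theta) = theta^2 starting from theta = 3.0
theta, cache = 3.0, 0.0
for _ in range(300):
    theta, cache = rmsprop_step(theta, 2.0 * theta, cache)
```

Because each step is normalized by the gradient's recent magnitude, RMSProp takes steps of roughly constant size regardless of how steep the loss surface is, which is quite different from Adam's additional momentum term.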
Let's use the Adadelta optimizer. The code is reused from the first model; observe the comments to track the changes.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
reset_graph()
# Place holders for the network
inputs_ = tf.placeholder(tf.float32,[None,28,28,1])
targets_ = tf.placeholder(tf.float32,[None,28,28,1])
# Leaky ReLU activation function
def lrelu(x,alpha=0.1):
return tf.maximum(alpha*x,x)
### Encoder
with tf.name_scope('en-convolutions'):
conv1 = tf.layers.conv2d(inputs_,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv1')
# Now 28x28x32
with tf.name_scope('en-pooling'):
maxpool1 = tf.layers.max_pooling2d(conv1,pool_size=(2,2),strides=(2,2),name='pool1')
# Now 14x14x32
with tf.name_scope('en-convolutions'):
conv2 = tf.layers.conv2d(maxpool1,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv2')
# Now 14x14x32
with tf.name_scope('encoding'):
encoded = tf.layers.max_pooling2d(conv2,pool_size=(2,2),strides=(2,2),name='encoding')
# Now 7x7x32.
#latent space
### Decoder
with tf.name_scope('decoder'):
conv3 = tf.layers.conv2d(encoded,filters=32,kernel_size=(3,3),strides=(1,1),name='conv3',padding='SAME',use_bias=True,activation=lrelu)
#Now 7x7x32
upsample1 = tf.layers.conv2d_transpose(conv3,filters=32,kernel_size=3,padding='same',strides=2,name='upsample1')
# Now 14x14x32
upsample2 = tf.layers.conv2d_transpose(upsample1,filters=32,kernel_size=3,padding='same',strides=2,name='upsample2')
# Now 28x28x32
logits = tf.layers.conv2d(upsample2,filters=1,kernel_size=(3,3),strides=(1,1),name='logits',padding='SAME',use_bias=True)
#Now 28x28x1
# Pass logits through sigmoid to get reconstructed image
decoded = tf.sigmoid(logits,name='recon')
#Defining the learning rate and cost
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits,labels=targets_)
# change optimizer here to Adadelta
learning_rate=tf.placeholder(tf.float32)
cost = tf.reduce_mean(loss) #cost
opt = tf.train.AdadeltaOptimizer(learning_rate).minimize(cost) #optimizer
# Training
sess = tf.Session()
#tf.reset_default_graph()
# saver = tf.train.Saver()
loss = []
valid_loss = []
epoch_list=[]
display_step = 1
epochs = 5
batch_size = 64
#lr=[1e-3/(2**(i//5))for i in range(epochs)]
#learning rate value
lr=1e-5
# Start the Tensorflow Session
sess.run(tf.global_variables_initializer())
# writer = tf.summary.FileWriter('./graphs', sess.graph)
for e in range(epochs):
total_batch = int(mnist.train.num_examples/batch_size)
for ibatch in range(total_batch):
batch_x = mnist.train.next_batch(batch_size)
batch_test_x= mnist.test.next_batch(batch_size)
imgs_test = batch_test_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in the test set
noise_factor = 0.5
x_test_noisy = imgs_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs_test.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
imgs = batch_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in training set
x_train_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
# Loss for the training set
batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: x_train_noisy,
targets_: imgs,learning_rate:lr})
#loss for the testing set
batch_cost_test = sess.run(cost, feed_dict={inputs_: x_test_noisy,
targets_: imgs_test})
if (e+1) % display_step == 0:
print("Epoch: {}/{}...".format(e+1, epochs),
"Training loss: {:.4f}".format(batch_cost),
"Validation loss: {:.4f}".format(batch_cost_test))
loss.append(batch_cost)
valid_loss.append(batch_cost_test)
epoch_list.append(e)
#plotting the validation and training loss
plt.plot(epoch_list, loss, 'bo', label='Training loss')
plt.plot(epoch_list, valid_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs ',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.legend()
plt.figure()
plt.show()
# saver.save(sess, 'encode_model')
#understanding the output for the testing set
# printing original, noise-induced, and generated images
batch_x= mnist.test.next_batch(3)
#inducing noise for the testing set
imgs = batch_x[0].reshape((-1, 28, 28, 1))
noise_factor = 0.5
x_test_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
recon_img = sess.run([decoded], feed_dict={inputs_: x_test_noisy})[0]
plt.figure(figsize=(20, 4))
plt.title('Reconstructed Images')
print("Original Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(imgs[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Noisy Images")
#noisy images
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(x_test_noisy[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Reconstruction of Noisy Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(recon_img[i, ..., 0], cmap='gray')
plt.show()
# writer.close()
sess.close()
Using the Adadelta optimizer throws the validation loss off slightly relative to the training loss: in most cases the validation loss is greater than the training loss. Additionally, comparing the images shows that the Adam optimizer performed best, ahead of both RMSProp and Adadelta. It would still be worth monitoring RMSProp over a larger number of epochs to confirm its performance.
Training loss: 0.6970 Validation loss: 0.6970
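For context, `tf.train.AdadeltaOptimizer` implements Zeiler's Adadelta rule, which sizes each step from running averages of squared gradients and squared past updates; TensorFlow then multiplies that step by the `learning_rate` argument. With `lr=1e-5` the updates are therefore tiny, which likely explains the nearly flat loss of 0.6970 above. A NumPy sketch of the rule (simplified, not the TF source):

```python
import numpy as np

def adadelta_step(theta, grad, acc_g, acc_dx, lr=1.0, rho=0.95, eps=1e-6):
    """One Adadelta update (Zeiler, 2012): the step is scaled by the ratio of
    the RMS of recent updates to the RMS of recent gradients."""
    acc_g = rho * acc_g + (1.0 - rho) * grad ** 2
    dx = -(np.sqrt(acc_dx + eps) / np.sqrt(acc_g + eps)) * grad
    acc_dx = rho * acc_dx + (1.0 - rho) * dx ** 2
    return theta + lr * dx, acc_g, acc_dx

# Minimize f(theta) = theta^2; even with lr=1.0, Adadelta starts very cautiously
theta, acc_g, acc_dx = 3.0, 0.0, 0.0
for _ in range(200):
    theta, acc_g, acc_dx = adadelta_step(theta, 2.0 * theta, acc_g, acc_dx)
```

Note how slowly it moves: the update accumulator `acc_dx` starts near zero, so the first steps are on the order of `sqrt(eps)`. Scaling that further by a learning rate of 1e-5 leaves the network essentially untrained after 5 epochs.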
The Adam optimizer performed the best, so the optimizer is a promising hyperparameter to tune.
Let's change the loss function to hinge loss. Observe the change in the code.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
reset_graph()
# Place holders for the network
inputs_ = tf.placeholder(tf.float32,[None,28,28,1])
targets_ = tf.placeholder(tf.float32,[None,28,28,1])
# Leaky ReLU activation function
def lrelu(x,alpha=0.1):
return tf.maximum(alpha*x,x)
### Encoder
with tf.name_scope('en-convolutions'):
conv1 = tf.layers.conv2d(inputs_,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv1')
# Now 28x28x32
with tf.name_scope('en-pooling'):
maxpool1 = tf.layers.max_pooling2d(conv1,pool_size=(2,2),strides=(2,2),name='pool1')
# Now 14x14x32
with tf.name_scope('en-convolutions'):
conv2 = tf.layers.conv2d(maxpool1,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv2')
# Now 14x14x32
with tf.name_scope('encoding'):
encoded = tf.layers.max_pooling2d(conv2,pool_size=(2,2),strides=(2,2),name='encoding')
# Now 7x7x32.
#latent space
### Decoder
with tf.name_scope('decoder'):
conv3 = tf.layers.conv2d(encoded,filters=32,kernel_size=(3,3),strides=(1,1),name='conv3',padding='SAME',use_bias=True,activation=lrelu)
#Now 7x7x32
upsample1 = tf.layers.conv2d_transpose(conv3,filters=32,kernel_size=3,padding='same',strides=2,name='upsample1')
# Now 14x14x32
upsample2 = tf.layers.conv2d_transpose(upsample1,filters=32,kernel_size=3,padding='same',strides=2,name='upsample2')
# Now 28x28x32
logits = tf.layers.conv2d(upsample2,filters=1,kernel_size=(3,3),strides=(1,1),name='logits',padding='SAME',use_bias=True)
#Now 28x28x1
# Pass logits through sigmoid to get reconstructed image
decoded = tf.sigmoid(logits,name='recon')
#Defining the learning rate and cost
#change to hinge_loss
loss = tf.losses.hinge_loss(targets_,logits)
learning_rate=tf.placeholder(tf.float32)
cost = tf.reduce_mean(loss) #cost
opt = tf.train.AdamOptimizer(learning_rate).minimize(cost) #optimizer
# Training
sess = tf.Session()
#tf.reset_default_graph()
# saver = tf.train.Saver()
loss = []
valid_loss = []
epoch_list=[]
display_step = 1
epochs = 5
batch_size = 64
#lr=[1e-3/(2**(i//5))for i in range(epochs)]
#learning rate value
lr=1e-5
# Start the Tensorflow Session
sess.run(tf.global_variables_initializer())
# writer = tf.summary.FileWriter('./graphs', sess.graph)
for e in range(epochs):
total_batch = int(mnist.train.num_examples/batch_size)
for ibatch in range(total_batch):
batch_x = mnist.train.next_batch(batch_size)
batch_test_x= mnist.test.next_batch(batch_size)
imgs_test = batch_test_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in the test set
noise_factor = 0.5
x_test_noisy = imgs_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs_test.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
imgs = batch_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in training set
x_train_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
# Loss for the training set
batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: x_train_noisy,
targets_: imgs,learning_rate:lr})
#loss for the testing set
batch_cost_test = sess.run(cost, feed_dict={inputs_: x_test_noisy,
targets_: imgs_test})
if (e+1) % display_step == 0:
print("Epoch: {}/{}...".format(e+1, epochs),
"Training loss: {:.4f}".format(batch_cost),
"Validation loss: {:.4f}".format(batch_cost_test))
loss.append(batch_cost)
valid_loss.append(batch_cost_test)
epoch_list.append(e)
#plotting the validation and training loss
plt.plot(epoch_list, loss, 'bo', label='Training loss')
plt.plot(epoch_list, valid_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs ',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.legend()
plt.figure()
plt.show()
# saver.save(sess, 'encode_model')
#understanding the output for the testing set
# printing original, noise-induced, and generated images
batch_x= mnist.test.next_batch(3)
#inducing noise for the testing set
imgs = batch_x[0].reshape((-1, 28, 28, 1))
noise_factor = 0.5
x_test_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
recon_img = sess.run([decoded], feed_dict={inputs_: x_test_noisy})[0]
plt.figure(figsize=(20, 4))
plt.title('Reconstructed Images')
print("Original Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(imgs[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Noisy Images")
#noisy images
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(x_test_noisy[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Reconstruction of Noisy Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(recon_img[i, ..., 0], cmap='gray')
plt.show()
# writer.close()
sess.close()
Observation:
Selecting the hinge loss cost function over sigmoid cross-entropy also slightly affected the performance of the network. The hinge loss produced output similar to that of sigmoid cross-entropy. It would be interesting to observe both over a few hundred epochs to distinguish their performance properly, but hinge loss is clearly also a viable choice for the autoencoder.
Training loss: 0.1848 Validation loss: 0.1855
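`tf.losses.hinge_loss` expects labels in {0, 1} and maps them to ±1 before applying the margin; roughly, per element, the loss is max(0, 1 - (2·labels - 1)·logits), mean-reduced. A NumPy sketch of that formula (not the TF source):

```python
import numpy as np

def hinge_loss(labels, logits):
    """Mean hinge loss with {0,1} labels mapped to {-1,+1}."""
    signs = 2.0 * labels - 1.0            # 0 -> -1, 1 -> +1
    return float(np.mean(np.maximum(0.0, 1.0 - signs * logits)))

labels = np.array([1.0, 0.0, 1.0, 0.0])
logits = np.array([2.0, -2.0, 0.5, 0.5])
loss = hinge_loss(labels, logits)
# Elementwise: max(0, 1-2)=0, max(0, 1+(-2)*(-1))... -> [0.0, 0.0, 0.5, 1.5], mean 0.5
```

Note that the autoencoder's targets are continuous pixel intensities in [0, 1] rather than hard binary labels, so hinge loss is a somewhat unusual fit here, which may account for the slightly higher loss values observed.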
Next, we will use reduce_sum together with tf.losses.sigmoid_cross_entropy (the tf.losses wrapper, rather than tf.nn.sigmoid_cross_entropy_with_logits). The code otherwise remains the same as the initial model; observe the changes.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
reset_graph()
# Place holders for the network
inputs_ = tf.placeholder(tf.float32,[None,28,28,1])
targets_ = tf.placeholder(tf.float32,[None,28,28,1])
# Leaky ReLU activation function
def lrelu(x,alpha=0.1):
return tf.maximum(alpha*x,x)
### Encoder
with tf.name_scope('en-convolutions'):
conv1 = tf.layers.conv2d(inputs_,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv1')
# Now 28x28x32
with tf.name_scope('en-pooling'):
maxpool1 = tf.layers.max_pooling2d(conv1,pool_size=(2,2),strides=(2,2),name='pool1')
# Now 14x14x32
with tf.name_scope('en-convolutions'):
conv2 = tf.layers.conv2d(maxpool1,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv2')
# Now 14x14x32
with tf.name_scope('encoding'):
encoded = tf.layers.max_pooling2d(conv2,pool_size=(2,2),strides=(2,2),name='encoding')
# Now 7x7x32.
#latent space
### Decoder
with tf.name_scope('decoder'):
conv3 = tf.layers.conv2d(encoded,filters=32,kernel_size=(3,3),strides=(1,1),name='conv3',padding='SAME',use_bias=True,activation=lrelu)
#Now 7x7x32
upsample1 = tf.layers.conv2d_transpose(conv3,filters=32,kernel_size=3,padding='same',strides=2,name='upsample1')
# Now 14x14x32
upsample2 = tf.layers.conv2d_transpose(upsample1,filters=32,kernel_size=3,padding='same',strides=2,name='upsample2')
# Now 28x28x32
logits = tf.layers.conv2d(upsample2,filters=1,kernel_size=(3,3),strides=(1,1),name='logits',padding='SAME',use_bias=True)
#Now 28x28x1
# Pass logits through sigmoid to get reconstructed image
decoded = tf.sigmoid(logits,name='recon')
#Defining the learning rate and cost
# use loss function tf.losses.sigmoid_cross_entropy
loss = tf.losses.sigmoid_cross_entropy(targets_,logits)
learning_rate=tf.placeholder(tf.float32)
#change to reduce_sum
cost = tf.reduce_sum(loss) #cost
opt = tf.train.AdamOptimizer(learning_rate).minimize(cost) #optimizer
# Training
sess = tf.Session()
#tf.reset_default_graph()
# saver = tf.train.Saver()
loss = []
valid_loss = []
epoch_list=[]
display_step = 1
epochs = 5
batch_size = 64
#lr=[1e-3/(2**(i//5))for i in range(epochs)]
#learning rate value
lr=1e-5
# Start the Tensorflow Session
sess.run(tf.global_variables_initializer())
# writer = tf.summary.FileWriter('./graphs', sess.graph)
for e in range(epochs):
total_batch = int(mnist.train.num_examples/batch_size)
for ibatch in range(total_batch):
batch_x = mnist.train.next_batch(batch_size)
batch_test_x= mnist.test.next_batch(batch_size)
imgs_test = batch_test_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in the test set
noise_factor = 0.5
x_test_noisy = imgs_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs_test.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
imgs = batch_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in training set
x_train_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
# Loss for the training set
batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: x_train_noisy,
targets_: imgs,learning_rate:lr})
#loss for the testing set
batch_cost_test = sess.run(cost, feed_dict={inputs_: x_test_noisy,
targets_: imgs_test})
if (e+1) % display_step == 0:
print("Epoch: {}/{}...".format(e+1, epochs),
"Training loss: {:.4f}".format(batch_cost),
"Validation loss: {:.4f}".format(batch_cost_test))
loss.append(batch_cost)
valid_loss.append(batch_cost_test)
epoch_list.append(e)
#plotting the validation and training loss
plt.plot(epoch_list, loss, 'bo', label='Training loss')
plt.plot(epoch_list, valid_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs ',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.legend()
plt.figure()
plt.show()
# saver.save(sess, 'encode_model')
#understanding the output for the testing set
# printing original, noise-induced, and generated images
batch_x= mnist.test.next_batch(3)
#inducing noise for the testing set
imgs = batch_x[0].reshape((-1, 28, 28, 1))
noise_factor = 0.5
x_test_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
recon_img = sess.run([decoded], feed_dict={inputs_: x_test_noisy})[0]
plt.figure(figsize=(20, 4))
plt.title('Reconstructed Images')
print("Original Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(imgs[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Noisy Images")
#noisy images
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(x_test_noisy[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Reconstruction of Noisy Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(recon_img[i, ..., 0], cmap='gray')
plt.show()
# writer.close()
sess.close()
Using tf.losses.sigmoid_cross_entropy with reduce_sum produced a similar trend of losses between the training set and the validation set. However, the clarity of the reconstructed images suggests that tf.nn.sigmoid_cross_entropy_with_logits combined with reduce_mean performs better than tf.losses.sigmoid_cross_entropy combined with reduce_sum. The main differences between the two loss functions are explained below:
Training loss: 0.1586 Validation loss: 0.1584
tf.nn.sigmoid_cross_entropy_with_logits computes sigmoid cross entropy given logits. It measures the probability error in discrete classification tasks in which each class is independent and not mutually exclusive; for instance, in multilabel classification a picture can contain both an elephant and a dog at the same time.
tf.losses.sigmoid_cross_entropy creates a cross-entropy loss using tf.nn.sigmoid_cross_entropy_with_logits. Its weights argument acts as a coefficient on the loss: if weights is a tensor of shape [batch_size], the loss weight applies to each corresponding sample.
Sigmoid cross entropy, in both its tf.nn and tf.losses forms, can be considered for training autoencoders.
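The numerically stable formulation behind tf.nn.sigmoid_cross_entropy_with_logits is max(x, 0) - x·z + log(1 + e^{-|x|}) for logits x and labels z. A NumPy sketch of it, which also shows why swapping reduce_mean for reduce_sum matters: the sum is larger by a factor of the element count, which effectively rescales the learning rate:

```python
import numpy as np

def sigmoid_ce_with_logits(labels, logits):
    """Stable elementwise sigmoid cross-entropy: max(x,0) - x*z + log(1 + e^-|x|)."""
    return np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))

labels = np.random.RandomState(0).rand(64, 28, 28, 1)   # fake targets in [0, 1]
logits = np.random.RandomState(1).randn(64, 28, 28, 1)  # fake decoder logits
per_pixel = sigmoid_ce_with_logits(labels, logits)

mean_cost = per_pixel.mean()   # what tf.reduce_mean computes
sum_cost = per_pixel.sum()     # what tf.reduce_sum computes
# sum_cost == mean_cost * number of elements (64*28*28 here)
```

With reduce_sum the gradient magnitude scales with the batch and image size, so the same nominal learning rate produces much larger parameter updates, which is one plausible reason the two runs behave differently.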
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
reset_graph()
# Place holders for the network
inputs_ = tf.placeholder(tf.float32,[None,28,28,1])
targets_ = tf.placeholder(tf.float32,[None,28,28,1])
# Leaky ReLU activation function
def lrelu(x,alpha=0.1):
return tf.maximum(alpha*x,x)
### Encoder
with tf.name_scope('en-convolutions'):
conv1 = tf.layers.conv2d(inputs_,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv1')
# Now 28x28x32
with tf.name_scope('en-pooling'):
maxpool1 = tf.layers.max_pooling2d(conv1,pool_size=(2,2),strides=(2,2),name='pool1')
# Now 14x14x32
with tf.name_scope('en-convolutions'):
conv2 = tf.layers.conv2d(maxpool1,filters=32,kernel_size=(3,3),strides=(1,1),padding='SAME',use_bias=True,activation=lrelu,name='conv2')
# Now 14x14x32
with tf.name_scope('encoding'):
encoded = tf.layers.max_pooling2d(conv2,pool_size=(2,2),strides=(2,2),name='encoding')
# Now 7x7x32.
#latent space
### Decoder
with tf.name_scope('decoder'):
conv3 = tf.layers.conv2d(encoded,filters=32,kernel_size=(3,3),strides=(1,1),name='conv3',padding='SAME',use_bias=True,activation=lrelu)
#Now 7x7x32
upsample1 = tf.layers.conv2d_transpose(conv3,filters=32,kernel_size=3,padding='same',strides=2,name='upsample1')
# Now 14x14x32
upsample2 = tf.layers.conv2d_transpose(upsample1,filters=32,kernel_size=3,padding='same',strides=2,name='upsample2')
# Now 28x28x32
logits = tf.layers.conv2d(upsample2,filters=1,kernel_size=(3,3),strides=(1,1),name='logits',padding='SAME',use_bias=True)
#Now 28x28x1
# Pass logits through sigmoid to get reconstructed image
decoded = tf.sigmoid(logits,name='recon')
#Defining the learning rate and cost
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits,labels=targets_)
learning_rate=tf.placeholder(tf.float32)
cost = tf.reduce_mean(loss) #cost
opt = tf.train.AdamOptimizer(learning_rate).minimize(cost) #optimizer
# Training
sess = tf.Session()
#tf.reset_default_graph()
# saver = tf.train.Saver()
loss = []
valid_loss = []
epoch_list=[]
display_step = 1
epochs = 5
batch_size = 64
#lr=[1e-3/(2**(i//5))for i in range(epochs)]
#learning rate value
lr=1e-5
# Start the Tensorflow Session
sess.run(tf.global_variables_initializer())
# writer = tf.summary.FileWriter('./graphs', sess.graph)
for e in range(epochs):
total_batch = int(mnist.train.num_examples/batch_size)
for ibatch in range(total_batch):
batch_x = mnist.train.next_batch(batch_size)
batch_test_x= mnist.test.next_batch(batch_size)
imgs_test = batch_test_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in the test set
#change noise here
noise_factor = 0.2
x_test_noisy = imgs_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs_test.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
imgs = batch_x[0].reshape((-1, 28, 28, 1))
#Inducing noise in training set
x_train_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
# Loss for the training set
batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: x_train_noisy,
targets_: imgs,learning_rate:lr})
#loss for the testing set
batch_cost_test = sess.run(cost, feed_dict={inputs_: x_test_noisy,
targets_: imgs_test})
if (e+1) % display_step == 0:
print("Epoch: {}/{}...".format(e+1, epochs),
"Training loss: {:.4f}".format(batch_cost),
"Validation loss: {:.4f}".format(batch_cost_test))
loss.append(batch_cost)
valid_loss.append(batch_cost_test)
epoch_list.append(e)
#plotting the validation and training loss
plt.plot(epoch_list, loss, 'bo', label='Training loss')
plt.plot(epoch_list, valid_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs ',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.legend()
plt.figure()
plt.show()
# saver.save(sess, 'encode_model')
#understanding the output for the testing set
# printing original, noise-induced, and generated images
batch_x= mnist.test.next_batch(3)
#inducing noise for the testing set
imgs = batch_x[0].reshape((-1, 28, 28, 1))
noise_factor = 0.2
x_test_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
recon_img = sess.run([decoded], feed_dict={inputs_: x_test_noisy})[0]
plt.figure(figsize=(20, 4))
plt.title('Reconstructed Images')
print("Original Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(imgs[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Noisy Images")
#noisy images
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(x_test_noisy[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Reconstruction of Noisy Images")
for i in range(3):
plt.subplot(2, 10, i+1)
plt.imshow(recon_img[i, ..., 0], cmap='gray')
plt.show()
# writer.close()
sess.close()
Decreasing the noise factor from 0.5 to 0.2 did not help the autoencoder. The loss trend for the training and validation sets remained the same, but the lower noise did not improve the decoder's ability to generate images. Hence, reducing the noise hyperparameter from 0.5 to 0.2 did not help the autoencoder.
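The noise_factor scales the standard deviation of the Gaussian corruption, and a larger factor also pushes more pixels against the [0, 1] clipping bounds. A quick NumPy check on synthetic images (the numbers are illustrative, not from the blog's runs):

```python
import numpy as np

def corruption_stats(imgs, noise_factor, seed=0):
    """Std of the noisy image and the fraction of pixels that clipping will clamp."""
    rng = np.random.RandomState(seed)
    noisy = imgs + noise_factor * rng.normal(size=imgs.shape)
    frac_clipped = float(np.mean((noisy < 0.0) | (noisy > 1.0)))
    return float(noisy.std()), frac_clipped

imgs = np.random.RandomState(42).rand(1000, 28, 28, 1)  # synthetic images in [0, 1]
std_02, clip_02 = corruption_stats(imgs, 0.2)
std_05, clip_05 = corruption_stats(imgs, 0.5)
# Higher noise_factor -> larger spread and a larger clipped fraction
```

So the two runs differ not only in noise magnitude but also in how much of the corruption is flattened by clipping, which makes the denoising task qualitatively different.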
Let's try a learning rate of 0.01. The initial code remains the same; observe the comments for the changes.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
def reset_graph(seed=2018):
tf.reset_default_graph()
tf.set_random_seed(seed)
np.random.seed(seed)
reset_graph()
# Place holders for the network
inputs_ = tf.placeholder(tf.float32,[None,28,28,1])
targets_ = tf.placeholder(tf.float32,[None,28,28,1])
# Leaky ReLU activation function
def lrelu(x,alpha=0.1):
return tf.maximum(alpha*x,x)
### Encoder
with tf.name_scope('en-convolutions'):
    conv1 = tf.layers.conv2d(inputs_, filters=32, kernel_size=(3,3), strides=(1,1), padding='SAME', use_bias=True, activation=lrelu, name='conv1')
# Now 28x28x32
with tf.name_scope('en-pooling'):
    maxpool1 = tf.layers.max_pooling2d(conv1, pool_size=(2,2), strides=(2,2), name='pool1')
# Now 14x14x32
with tf.name_scope('en-convolutions'):
    conv2 = tf.layers.conv2d(maxpool1, filters=32, kernel_size=(3,3), strides=(1,1), padding='SAME', use_bias=True, activation=lrelu, name='conv2')
# Now 14x14x32
with tf.name_scope('encoding'):
    encoded = tf.layers.max_pooling2d(conv2, pool_size=(2,2), strides=(2,2), name='encoding')
# Now 7x7x32.
#latent space
### Decoder
with tf.name_scope('decoder'):
    conv3 = tf.layers.conv2d(encoded, filters=32, kernel_size=(3,3), strides=(1,1), name='conv3', padding='SAME', use_bias=True, activation=lrelu)
    # Now 7x7x32
    upsample1 = tf.layers.conv2d_transpose(conv3, filters=32, kernel_size=3, padding='same', strides=2, name='upsample1')
    # Now 14x14x32
    upsample2 = tf.layers.conv2d_transpose(upsample1, filters=32, kernel_size=3, padding='same', strides=2, name='upsample2')
    # Now 28x28x32
    logits = tf.layers.conv2d(upsample2, filters=1, kernel_size=(3,3), strides=(1,1), name='logits', padding='SAME', use_bias=True)
    # Now 28x28x1
    # Pass logits through sigmoid to get the reconstructed image
    decoded = tf.sigmoid(logits, name='recon')
#Defining the learning rate and cost
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits,labels=targets_)
learning_rate=tf.placeholder(tf.float32)
cost = tf.reduce_mean(loss) #cost
opt = tf.train.AdamOptimizer(learning_rate).minimize(cost) #optimizer
# Training
sess = tf.Session()
#tf.reset_default_graph()
# saver = tf.train.Saver()
loss = []
valid_loss = []
epoch_list=[]
display_step = 1
epochs = 5
batch_size = 64
#lr=[1e-3/(2**(i//5))for i in range(epochs)]
# changed learning rate value to 0.01
lr = 0.01
# Start the Tensorflow Session
sess.run(tf.global_variables_initializer())
# writer = tf.summary.FileWriter('./graphs', sess.graph)
for e in range(epochs):
    total_batch = int(mnist.train.num_examples/batch_size)
    for ibatch in range(total_batch):
        batch_x = mnist.train.next_batch(batch_size)
        batch_test_x = mnist.test.next_batch(batch_size)
        imgs_test = batch_test_x[0].reshape((-1, 28, 28, 1))  # use the test batch here
        # Inducing noise in the test set
        noise_factor = 0.5
        x_test_noisy = imgs_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs_test.shape)
        x_test_noisy = np.clip(x_test_noisy, 0., 1.)
        imgs = batch_x[0].reshape((-1, 28, 28, 1))
        # Inducing noise in the training set
        x_train_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
        x_train_noisy = np.clip(x_train_noisy, 0., 1.)
        # Loss for the training set
        batch_cost, _ = sess.run([cost, opt], feed_dict={inputs_: x_train_noisy,
                                                         targets_: imgs, learning_rate: lr})
        # Loss for the test set
        batch_cost_test = sess.run(cost, feed_dict={inputs_: x_test_noisy,
                                                    targets_: imgs_test})
    if (e+1) % display_step == 0:
        print("Epoch: {}/{}...".format(e+1, epochs),
              "Training loss: {:.4f}".format(batch_cost),
              "Validation loss: {:.4f}".format(batch_cost_test))
    loss.append(batch_cost)
    valid_loss.append(batch_cost_test)
    epoch_list.append(e)
#plotting the validation and training loss
plt.plot(epoch_list, loss, 'bo', label='Training loss')
plt.plot(epoch_list, valid_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs ',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.legend()
plt.figure()
plt.show()
# saver.save(sess, 'encode_model')
#understanding the output for the testing set
# printing original, noise-induced and generated images
batch_x= mnist.test.next_batch(3)
#inducing noise for the testing set
imgs = batch_x[0].reshape((-1, 28, 28, 1))
noise_factor = 0.5
x_test_noisy = imgs + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=imgs.shape)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
recon_img = sess.run([decoded], feed_dict={inputs_: x_test_noisy})[0]
plt.figure(figsize=(20, 4))
plt.title('Reconstructed Images')
print("Original Images")
for i in range(3):
    plt.subplot(2, 10, i+1)
    plt.imshow(imgs[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Noisy Images")
#noisy images
for i in range(3):
    plt.subplot(2, 10, i+1)
    plt.imshow(x_test_noisy[i, ..., 0], cmap='gray')
plt.show()
plt.figure(figsize=(20, 4))
print("Reconstruction of Noisy Images")
for i in range(3):
    plt.subplot(2, 10, i+1)
    plt.imshow(recon_img[i, ..., 0], cmap='gray')
plt.show()
# writer.close()
sess.close()
Using a learning rate of 0.01 skewed the losses completely. The training and validation losses no longer decrease consistently; the training loss eventually falls, but not in alignment with the test loss. Let's compare the quality of the images generated with learning rates of 1e-5 and 0.01.
Training loss: 0.0999 Validation loss: 0.1006

The learning rate of 0.01 definitely provided more clarity than the learning rate of 1e-5.
It is observed that the autoencoder completely misses the required pixel values over 5 epochs with a learning rate of 0.1. Since the learning rate is high, the optimizer traverses the loss landscape quickly, skipping past the pixel values along the way. Hence, a learning rate of 0.01 is the most suitable for this autoencoder. Additionally, comparing the loss graphs shows a steep increase in the loss as well. Hence, the learning rate also affects the denoising autoencoder and is a parameter that should be tuned.
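Because the code above feeds the learning rate through a placeholder, it can also be changed per epoch without rebuilding the graph; the schedule commented out in the listing halves a base rate of 1e-3 every 5 epochs. A plain-Python sketch of the values that schedule produces:

```python
# Halve the base learning rate of 1e-3 every 5 epochs
epochs = 15
lr_schedule = [1e-3 / (2 ** (i // 5)) for i in range(epochs)]
print(lr_schedule[0], lr_schedule[5], lr_schedule[10])  # 0.001 0.0005 0.00025
```

Each epoch `e` would then feed `learning_rate: lr_schedule[e]` into the `sess.run` call instead of a fixed `lr`.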
Tuning the autoencoder shows that the noise hyperparameter, together with Hinge Loss and Sigmoid Cross Entropy as loss functions, seems to work best. Reducing the noise helped improve the loss, presumably because the autoencoder can learn the images better. Additionally, a learning rate of 0.01 provided very low losses of 0.0999 and 0.1006 for the training and validation sets respectively. Hence, the prospective hyperparameters to tune first for an autoencoder are as follows:
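As a sketch of how such a tuning loop could be organized as a small grid search — `train_and_evaluate` here is a hypothetical stand-in for rebuilding the graph, retraining the autoencoder, and returning its validation loss:

```python
import itertools

def train_and_evaluate(noise_factor, lr):
    # Hypothetical stand-in: in practice this would retrain the
    # denoising autoencoder and return its validation loss.
    # Arbitrary placeholder value so the sketch runs end to end.
    return noise_factor + lr

noise_factors = [0.5, 0.3, 0.2]      # candidate noise hyperparameter values
learning_rates = [1e-5, 1e-3, 0.01]  # candidate learning rates

# Pick the configuration with the lowest (placeholder) validation loss
best = min(itertools.product(noise_factors, learning_rates),
           key=lambda cfg: train_and_evaluate(*cfg))
print("Best (noise_factor, lr):", best)
```

With a real `train_and_evaluate`, each configuration would reuse the training loop shown earlier, and the grid can be extended with the other hyperparameters listed above.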
Summary

The MLP code is based on the following:
[1] Tomar, Nikhil. "Iris Dataset Classification using Tensorflow". github.io, October 11, 2017. Web. 22 April 2018. (no licence specified) Web link:
http://idiotdeveloper.tk/iris-data-set-classification-using-tensorflow-multilayer-perceptron/
Running the Tensorflow Session is based on the following article:
[2] Vikram K. “Deep Learning”. github.io, August 28. 2016. Web. 22 April. 2018. (no licence specified)
Weblink : https://github.com/Vikramank/Deep-Learning-/blob/master/Iris%20data%20classification.ipynb
.youtube.com/watch?v=a5BUunInTQU&t=1227s
[3] Understanding basics of Deep Neural Networks using the following videos by MIT
Deep Learning - https://www.youtube.com/watch?v=JN6H4rQvwgY&feature=youtu.be
[1] Understanding Xavier Initialization
Weblinks:
http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization
[2] HVASS Laboratories - Tensorflow Tutorials 06 - CIFAR10
Weblink :
https://www.youtube.com/watch?v=3BXfw_1_TF4
[3] Magnus Erik Hvass Pedersen. "Tensorflow Tutorials". github.io, licensed under MIT, December 16, 2016. Web. 23 April 2018.
Used for visualizing the CIFAR 10 Data
Weblink : https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/06_CIFAR-10.ipynb
[4] Ataspinar. "Building Convolutional Neural Networks with Tensorflow". github.io, December 16, 2016. Web. 23 April 2018.
The LeNet-5 and LeNet-like CNN code is based on the following article: https://github.com/taspinar/sidl
[5] Ataspinar. "Building Convolutional Neural Networks with Tensorflow". December 16, 2016. Web. 23 April 2018.
Web link: http://ataspinar.com/2017/08/15/building-convolutional-neural-networks-with-tensorflow/
[6] https://en.wikipedia.org/wiki/Convolutional_neural_network [7] https://en.wikipedia.org/wiki/Hinge_loss
[8] The data is accessed from the following link
https://github.com/sumit-kothari/AlphaNum-HASYv2
[9] Jasdeep06. "Understanding-LSTM-in-Tensorflow-MNIST". github.io, 10 Sept. 2017. Web. 18 April 2018. The network structure and exploratory data analysis functions are based on the following link
https://jasdeep06.github.io/posts/Understanding-LSTM-in-Tensorflow-MNIST/
[10] Kothari, Sumit. "Alpha-Numeric Handwritten Dataset". 10 Sept. 2017. Web. 18 April 2018.
Data preprocessing is based on the below link
https://www.kaggle.com/usersumit/basic-eda-keras-ann-model/notebook
[11] Other references
Chentinghao, Tinghao. "Tensorflow RNN Tutorial MNIST". Medium, 9 January 2018. Web. 18 April 2018.
https://medium.com/machine-learning-algorithms/mnist-using-recurrent-neural-network-2d070a5915a2
https://github.com/chentinghao/tinghao-tensorflow-rnn-tutorial/blob/master/mnist_rnn.ipynb
[12] Implement Tensorflow Next Batch, Stack Overflow
[13] Background research on RNN LSTM was done using the article below
https://deeplearning4j.org/lstm.html
[14] Omid Alemi. "Implementation of Restricted Boltzmann Machine (RBM) and its variants in Tensorflow". Licensed under the MIT License, Medium, 15 August 2017. Web. 23 April 2018.
The code is based on code by Omid Alemi, licensed under MIT
Website: https://github.com/patricieni/RBM-Tensorflow/blob/master/Gaussian%20RBM.ipynb https://github.com/omimo/xRBM/tree/master/examples
[15] The MNIST dataset description is based on this article
Website : http://corochann.com/mnist-dataset-introduction-1138.html https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine
The code is based on the blog post below, with custom additions of functions and illustrations
[16] Kristiadi, Agustinus. "Generative Adversarial Network using Tensorflow". September 17, 2016. Web. 19 April 2018. Weblinks:
https://wiseodd.github.io/techblog/2016/09/17/gan-tensorflow/
https://github.com/wiseodd/generative-models/blob/master/GAN/softmax_gan/softmax_gan_tensorflow.py
https://wiseodd.github.io/page3/
[17] Background research
https://en.wikipedia.org/wiki/Generative_adversarial_network
[18] The code is based on the blog below, with custom changes made to incorporate the requirements of the project. Sharma, Aditya. "Understanding Autoencoders using Tensorflow". LearnOpenCV, November 15, 2017. Web. 20 April 2018. Website: https://www.learnopencv.com/understanding-autoencoders-using-tensorflow-python/
[19] Mallick, Satya (spmallick). "Denoising-Autoencoder-using-Tensorflow". GitHub, LearnOpenCV, November 26, 2017. Web. 20 April 2018. Website: https://github.com/spmallick/learnopencv/tree/master/DenoisingAutoencoder
https://www.tensorflow.org/api_docs/python/tf/nn/sigmoid_cross_entropy_with_logits https://www.tensorflow.org/api_docs/python/tf/losses/sigmoid_cross_entropy

This work is licensed under a Creative Commons Attribution 3.0 United States License.